Four phases. One system.
Overview
Phase 1 converts raw Java code into structured, queryable intelligence. Every source file is walked, parsed, normalized, stored in a property graph, and vectorized — building the dual backbone the query engine relies on.
Pipeline
Repo → JavaParser → AST → Normalize → Neo4j + Qdrant
Steps
01
Parse Code — JavaParser
The entire repository is scanned. Every .java file is parsed into a structured AST-like JSON.
// Output: AST-like JSON
{
  "class_name": "PaymentService",
  "methods": [{
    "name": "processPayment",
    "calls": ["validationService.validate", ...]
  }]
}
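The scanning half of this step can be sketched in a few lines. This is only an illustration under stated assumptions: the helper name find_java_files is hypothetical, and the parsing itself is left to JavaParser, which this sketch does not call.

```python
from pathlib import Path


def find_java_files(repo_root: str) -> list[Path]:
    """Return every .java file under repo_root, sorted for stable ordering."""
    return sorted(p for p in Path(repo_root).rglob("*.java") if p.is_file())
```

Each returned path would then be handed to the parser, one file at a time.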
02
Normalization — ASTParser
Raw JSON is cleaned and standardized. Method calls are resolved to fully qualified names (FQN).
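A minimal sketch of what FQN resolution might look like, assuming the normalizer knows the declared type of each field. The function resolve_calls, the field_types map, and the com.example package name are all hypothetical and not taken from the actual implementation.

```python
def resolve_calls(cls: dict, field_types: dict[str, str]) -> dict:
    """Return a copy of the class record with receiver names replaced by FQN types."""
    resolved_methods = []
    for method in cls["methods"]:
        resolved = []
        for call in method["calls"]:
            # Split "validationService.validate" into receiver and method name.
            receiver, _, name = call.rpartition(".")
            target = field_types.get(receiver)
            # Unknown receivers are left untouched rather than guessed.
            resolved.append(f"{target}.{name}" if target else call)
        resolved_methods.append({**method, "calls": resolved})
    return {**cls, "methods": resolved_methods}


record = {
    "class_name": "PaymentService",
    "methods": [{"name": "processPayment", "calls": ["validationService.validate"]}],
}
# Hypothetical mapping from field name to declared type.
field_types = {"validationService": "com.example.validation.ValidationService"}
normalized = resolve_calls(record, field_types)
```

Leaving unresolved calls as-is keeps the output honest: a later stage can flag them instead of storing a wrong edge.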
03
Graph Building — Neo4j
Normalized data is converted into a property graph. Nodes represent Services and Methods. Edges represent CALLS and USES relationships.
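One way to picture the graph-building step is as a translation from a normalized record into parameterized Cypher statements. The sketch below is an assumption about how that could be done: the property names fqn and name, the helper graph_statements, and the rule that a call into another class implies a USES edge are all illustrative.

```python
CALLS_QUERY = (
    "MERGE (a:Method {fqn: $caller}) "
    "MERGE (b:Method {fqn: $callee}) "
    "MERGE (a)-[:CALLS]->(b)"
)
USES_QUERY = (
    "MERGE (a:Service {name: $src}) "
    "MERGE (b:Service {name: $dst}) "
    "MERGE (a)-[:USES]->(b)"
)


def graph_statements(cls: dict) -> list[tuple[str, dict]]:
    """Turn one normalized class record into (query, params) pairs."""
    statements = []
    service = cls["class_name"]
    for method in cls["methods"]:
        caller = f"{service}.{method['name']}"
        for callee in method["calls"]:
            statements.append((CALLS_QUERY, {"caller": caller, "callee": callee}))
            # Everything before the last dot is treated as the owning service.
            target_service = callee.rpartition(".")[0]
            if target_service and target_service != service:
                statements.append((USES_QUERY, {"src": service, "dst": target_service}))
    return statements


statements = graph_statements({
    "class_name": "PaymentService",
    "methods": [{"name": "processPayment", "calls": ["ValidationService.validate"]}],
})
```

MERGE is used rather than CREATE so that re-indexing the same repository does not duplicate nodes or edges. Each pair could then be executed with the Neo4j Python driver, for example via session.run(query, params).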
04
Embeddings — Qdrant
Text summaries are converted into dense vectors using all-MiniLM-L6-v2 and persisted in Qdrant for Phase 2.
Key Insight
AST reveals structure. Normalization makes it usable. Graph encodes relationships. Embeddings encode meaning.
Stack
JavaParser
Neo4j
Qdrant
Python
Overview
Phase 2 uses the indexed data from Phase 1 to answer questions, analyze impact, and provide contextual insights. It transforms stored knowledge into precise answers.
Pipeline
Query → Qdrant → Neo4j → Context Fusion → LLM
Steps
01
Input (Query or Change)
Input can be a natural language query or a specific code change (diff + changed methods).
02
Retrieval & Graph Traversal
We perform a cosine similarity search in Qdrant, then enrich the results using Neo4j to find downstream method calls and cross-service dependencies.
// Enrichment Path
PaymentService.processPayment
→ CALLS → ValidationService.validate
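The two-stage lookup can be illustrated with an in-memory stand-in. This sketch deliberately avoids the Qdrant and Neo4j clients: a plain dict plays the vector index, another dict plays the CALLS edges, and the three-dimensional vectors are made-up toy values.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_vec, index, calls, top_k=1):
    """Rank methods by similarity, then attach their outgoing CALLS edges."""
    ranked = sorted(index, key=lambda name: cosine(query_vec, index[name]), reverse=True)
    return [{"method": name, "calls": calls.get(name, [])} for name in ranked[:top_k]]


# Toy data standing in for the vector store and the graph.
index = {
    "PaymentService.processPayment": [0.9, 0.1, 0.0],
    "ValidationService.validate": [0.1, 0.9, 0.0],
}
calls = {"PaymentService.processPayment": ["ValidationService.validate"]}
hits = retrieve([1.0, 0.0, 0.0], index, calls)
```

The point of the split is that similarity finds the entry point, while the graph supplies neighbors that may share no vocabulary with the query at all.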
03
Context Fusion & LLM
Retrieved semantic context and structural graph relationships are combined into a structured context block to improve LLM accuracy.
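A structured context block might look something like the output of the sketch below. The section labels, the summary field, and the function build_context are hypothetical; the actual prompt layout is not described in the source.

```python
def build_context(question: str, hits: list[dict]) -> str:
    """Merge semantic hits and their graph edges into one labeled text block."""
    lines = [f"QUESTION: {question}", "", "RELEVANT CODE:"]
    for hit in hits:
        lines.append(f"- {hit['method']}: {hit['summary']}")
    lines += ["", "CALL GRAPH:"]
    for hit in hits:
        for callee in hit["calls"]:
            lines.append(f"- {hit['method']} -> CALLS -> {callee}")
    return "\n".join(lines)


context = build_context(
    "How are payments validated?",
    [{
        "method": "PaymentService.processPayment",
        "summary": "Processes a payment after validation.",  # illustrative summary text
        "calls": ["ValidationService.validate"],
    }],
)
```

Keeping the semantic and structural parts in separate labeled sections lets the model tell what was retrieved by meaning apart from what is known from the graph.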
04
Analyze Mode (Impact)
For code changes, we compute the "Blast Radius" using the graph to identify affected downstream components.
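A blast-radius computation is essentially a graph reachability search. The sketch below uses a breadth-first traversal over an in-memory map. It assumes that affects[x] lists the components that must be re-checked when x changes (for example, the callers of x); that edge direction, the function name, and CheckoutController are assumptions made for illustration.

```python
from collections import deque


def blast_radius(changed: list[str], affects: dict[str, list[str]]) -> set[str]:
    """Return every component transitively affected by the changed methods."""
    seen = set(changed)
    queue = deque(changed)
    while queue:
        node = queue.popleft()
        for nxt in affects.get(node, []):
            if nxt not in seen:  # the seen set also guards against cycles
                seen.add(nxt)
                queue.append(nxt)
    # The changed methods themselves are not part of the radius.
    return seen - set(changed)


affects = {
    "ValidationService.validate": ["PaymentService.processPayment"],
    "PaymentService.processPayment": ["CheckoutController.submit"],  # hypothetical component
}
radius = blast_radius(["ValidationService.validate"], affects)
```

In a graph database the same idea would typically be expressed as a variable-length path query rather than an explicit loop.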
Key Insight
Vector search provides semantic understanding. Graph traversal provides structural accuracy. Together, they enable context-aware reasoning.
Stack
Qdrant
Neo4j Cypher
OpenAI
LangChain
Overview
Phase 3 is an execution engine that connects user intent to backend intelligence, safely applies code changes, and keeps the workspace in sync. The frontend is not a simple UI; it is an orchestration layer.
Architecture Flow
Webview UI → Ext Host → Backend API → Git Execute
Core Responsibilities
01
Frontend Interaction
Captures user input, handles chat interface and message rendering, manages model selection (e.g., ripple-core, sentinel-x), and displays action buttons like Apply Fix and Insert Code.
02
Communication Layer
Sends messages from the UI to the extension host using postMessage and receives updates via onDidReceiveMessage.
03
Extension Host Orchestration
Handles webview messages, builds API requests, calls backend endpoints, applies file changes safely, and manages the full workspace context.
04
Safe Execution & Git
Executes git commands directly to apply changes. Collects dependency manifests and sends them to the backend. Edits are always applied in descending line order, so an edit near the end of a file never shifts the line numbers of edits still pending above it.
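The descending-order rule is easy to demonstrate. The sketch below is written in Python for consistency with the other examples, although the extension itself would run in the VS Code extension host; the function name and the edit tuple format are hypothetical, and edits are assumed not to overlap.

```python
def apply_edits(lines: list[str], edits: list[tuple[int, int, list[str]]]) -> list[str]:
    """Apply (start, end, replacement) edits; line ranges are 1-indexed and inclusive."""
    result = list(lines)
    # Work from the bottom of the file upward so earlier line numbers stay valid.
    for start, end, replacement in sorted(edits, key=lambda e: e[0], reverse=True):
        result[start - 1:end] = replacement
    return result


source = ["a", "b", "c", "d"]
edits = [
    (1, 1, ["A1", "A2"]),  # grows the file by one line
    (3, 4, ["C"]),         # shrinks the file by one line
]
patched = apply_edits(source, edits)
```

Applied top-down, the first edit would push "c" and "d" to lines 4 and 5, and the second edit would then hit the wrong lines. Bottom-up application avoids any offset bookkeeping.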
Security Division
The frontend never performs security scanning; it only forwards data and applies fixes. Trivy runs exclusively on the backend, alongside Neo4j and Qdrant, which power the analysis. The extension orchestrates everything.
Stack
VS Code Webview
postMessage
Extension Host
Git
Trivy (Backend)
Overview
Phase 4 is the MCP bridge layer — a deliberately thin stdio adapter that connects Cursor's AI to the CORTEX FastAPI backend. All intelligence lives in the backend; MCP is just the door Cursor walks through to reach it.
Architecture Flow
Cursor → stdio JSON-RPC → MCP Server → HTTP httpx → FastAPI
Steps
01
Cursor Launches the Process
Cursor reads your MCP config and spawns server.py as a child process. Communication happens over stdio — JSON-RPC messages on stdin/stdout. No HTTP port, no network — just pipes.
# MCP config → Cursor launches
python .../cortex_mcp/server.py
# speaks JSON-RPC over stdin/stdout
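To make the wire format concrete, here is a sketch of what a single tool invocation might look like on stdin. It assumes the stdio transport carries one JSON-RPC 2.0 message per line; the helper frame_tool_call is hypothetical, and in practice the SDK builds and parses these messages itself.

```python
import json


def frame_tool_call(request_id: int, tool: str, arguments: dict) -> str:
    """Serialize a tools/call request as a single newline-terminated line."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    # json.dumps escapes newlines inside strings, so the message stays on one line.
    return json.dumps(message) + "\n"


line = frame_tool_call(1, "cortex_query", {"question": "What calls validate?"})
```

The server would write its response, carrying the same id, to stdout in the same framing.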
02
FastMCP Handles the Protocol
The SDK's FastMCP handles all protocol machinery — handshakes, tools/list, tools/call, error framing. Tools are registered with @mcp.tool() decorators. Server instructions tell Cursor when to call what.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("CORTEX AI", instructions="...")

@mcp.tool()
def cortex_query(question: str) -> str:
    """Ask a natural language question."""
    ...
03
Each Tool = One HTTP Call
Tool handlers build an httpx client against CORTEX_API_BASE (default 127.0.0.1:8080/api/v1), optionally add Authorization: Bearer, and forward arguments as JSON to routes like /query, /analyze, /index-repo.
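import os
import httpx

def _client() -> httpx.Client:
    # CORTEX_API_BASE overrides the default; TOKEN is assumed to be defined elsewhere
    return httpx.Client(
        base_url=os.environ.get("CORTEX_API_BASE", "http://127.0.0.1:8080/api/v1"),
        headers={"Authorization": f"Bearer {TOKEN}"},
    )

# inside a tool handler: the tool just forwards and returns
with _client() as c:
    r = c.post("/analyze", json={...})
    return r.text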
04
Two Processes Must Be Running
The MCP server is managed by Cursor. The FastAPI backend must be started independently via uvicorn. If the backend is down, MCP tools still respond, but they return HTTP connection errors as text.
Key Insight — Thin Bridge Pattern
Most MCP servers embed their capability in-process (read files, query a DB). CORTEX's server is a pure proxy — no Neo4j, no Qdrant, no RAG runs inside it. This keeps the MCP layer stateless and the backend independently scalable.
End-to-End Path
IN
Cursor picks a tool based on the server instructions and the question context
MCP
stdio JSON-RPC → tool call reaches server.py handler
HTTP
httpx POST → FastAPI backend runs RAG / Sentinel / Veritas pipeline
OUT
r.text (JSON or markdown) returned over stdio back to Cursor
Stack
FastMCP
stdio JSON-RPC
httpx
Python
Cursor