Building a Simple Streamlit AI Chatbot with OpenRouter Auto-Routing
- Ken Munson
- Jun 7
- 6 min read
Executive Summary
In this mini-project, I built a small Streamlit-based AI assistant (an AI Chat Bot) and integrated it with OpenRouter so the application could dynamically route prompts across many large language models.
The app itself is intentionally simple:
A local Streamlit web interface
A text input box for prompts
An OpenRouter-backed model call
A cost/quality slider (to indicate what kind of model OpenRouter should choose)
Copy buttons for both the user question and model response
Visibility into which model OpenRouter actually selected (printed on screen at the end of each response).

The real value of the project was not the UI. The value was understanding how easy it is to move from a single-model API call to a multi-model routing architecture.
Instead of hardcoding one model provider and one model, the app now uses OpenRouter’s openrouter/auto router and lets OpenRouter decide which model to use based on the prompt and a configurable cost/quality tradeoff (the slider).
Phase 1: Starting with a Direct OpenAI API Call
The first version of the app was very basic.
It used:
Python
Streamlit
The OpenAI Python SDK
A local environment variable for the API key
A hardcoded model value
The basic flow was:
Streamlit UI → OpenAI SDK → OpenAI API → Model responseThe initial application worked, but I quickly ran into a few useful learning moments.
The first was API key management. The app was reading OPENAI_API_KEY from the local environment, but that key did not match the keys visible in the OpenAI API platform projects. This caused quota-related errors even though the account appeared to have credits available.
That debugging step reinforced an important operational point:
The API key determines the project, billing context, and access path used by the application.Once I created a clean API key in a dedicated project and made the environment variable persistent, the app worked correctly.
I seem to spend half my life dealing with API keys! There is some valuable scar tissue there however.
Phase 2: Verifying the Actual Model Used
The next lesson was model identity.
I initially asked the app:
What model am I talking to?The model gave inconsistent answers, at one point claiming to be GPT-3.5 and later GPT-4.1.
That was a useful reminder:
A model’s self-description is not a reliable way to identify the model.The reliable source is the API response metadata.
So I updated the app to display:
response.modelThat gave direct visibility into the actual model returned by the API. This became especially important once OpenRouter was added, because OpenRouter may route openrouter/auto to different underlying models depending on the prompt and settings.
Phase 3: Moving from OpenAI Direct to OpenRouter
The OpenRouter integration went surprisingly well.
Instead of using the OpenAI SDK against OpenAI’s default endpoint, the app was updated to use OpenRouter’s OpenAI-compatible endpoint:
client = OpenAI(
base_url="https://openrouter.ai/api/v1",
api_key=os.getenv("OPENROUTER_API_KEY"),
)The model was changed from a specific OpenAI model to:
MODEL = "openrouter/auto"That changed the architecture from:
Streamlit UI → OpenAI SDK → OpenAI modelto:
Streamlit UI → OpenAI SDK → OpenRouter endpoint → Auto-selected modelThis is the key architectural shift.
The application no longer needs to know in advance whether the best model for a prompt should come from OpenAI, DeepSeek, Anthropic, Google, or another provider. It simply sends the prompt to OpenRouter and receives both the answer and the actual model selected.
Phase 4: Adding a Cost/Quality Tradeoff Slider
This is pretty cool I have to say. After the OpenRouter call was working, I added a Streamlit slider for the OpenRouter auto-router cost/quality tradeoff.
The slider runs from:
0 = favor highest quality (generally more expensive tokens)
10 = favor lowest costIn the UI, this became a simple control:
OpenRouter cost/quality tradeoff: [0 -------- 10]This turned the app into a small routing experiment.
With the slider set around 7, OpenRouter selected a stronger model for even simple prompts. With the slider moved to 9, OpenRouter selected a cheaper/faster model for straightforward questions.
That was the first moment the app became more than a chatbot wrapper.
It became a way to observe model routing behavior.
This is my attempt at controlling token cost now that the foundation labs have decided to IPO, which means they need to start being profitable. Of note, it has been my experience sending the same query to different models that the questions I ask, and see other people asking, don't really require the latest and greatest model. Admittedly I don't have a 100 agents refactoring some 200 file repo but that is the beauty of OpenRouter - choose the level of model you need for the task at hand.
Phase 5: Adding Copy Buttons and a Scrollable Response Box
The next usability improvement was adding copy buttons.
I wanted the app to feel a little more like a modern chat interface, so I added:
A copy button for the user’s question
A copy button for the assistant’s response
A scrollable response area for long answers
This required a small custom HTML component inside Streamlit.
A first implementation mostly worked, but long model outputs exposed a display issue: the response appeared to be too short. Copying the response into Notepad revealed that the full answer was actually present. The custom display box was simply clipping the visible content.
The fix was to make the response area scroll internally:
max-height: 520px;
overflow-y: auto;That kept the page compact while still preserving the full response for reading and copying.
The final UI included:
Prompt input
Ask button
Copyable question box
Copyable assistant response box
Cost/quality slider
Actual selected modelFinal Architecture
The final local architecture looks like this:
User
↓
Streamlit UI
↓
OpenAI Python SDK
↓
OpenRouter API endpoint
↓
OpenRouter auto-router
↓
Selected model provider
↓
Model response
↓
Streamlit display + copy controlsThe application remains intentionally lightweight, but the model path is now flexible.
Instead of writing separate integrations for different providers, the app can use one OpenAI-compatible interface and let OpenRouter handle model selection.
What I Learned
1. API Keys Are Part of the Architecture (and a necessary evil).
This project started with a simple quota error, but the root cause was an environment variable pointing to the wrong key.
That reinforced a practical lesson:
Local environment variables are infrastructure.If the wrong key is loaded, the application may run under the wrong project, wrong billing context, or wrong provider account.
Side bar: Concentrate on API hygiene. If you can avoid even displaying the key on a terminal, that is best (easy to do - just ask you LLM). Obviously you don't copy it into your source code - you use an env variable for that. And the number of API keys on GitHub is ridiculous so, may sure you key doesn't get copied up to GitHub. Ask you LLM to expand on methods to avoid this!!!
2. Never Ask the Model What Model It Is
The model’s answer to “what model are you?” was unreliable.
The correct approach was to inspect the API response metadata.
For OpenRouter, this became especially useful because the requested model was:
openrouter/autobut the actual model selected could vary by prompt and routing settings.
3. OpenRouter Makes Model Experimentation Easier
The most interesting part of this project was how little code changed when moving from OpenAI direct to OpenRouter.
The main changes were:
base_url="https://openrouter.ai/api/v1"and:
MODEL = "openrouter/auto"That small change opened access to a multi-model routing layer.
4. Cost Control Needs to Be Visible
The cost/quality slider made the routing behavior tangible.
Instead of abstractly saying “use cheaper models when possible,” the app exposed the routing preference directly in the UI.
That made it possible to test the same prompt at different cost/quality settings and observe which model OpenRouter selected.
5. UI Details Matter Even in Small Tools
The copy buttons and scrollable response box were small additions, but they made the app much more usable.
For a local AI workbench, the ability to quickly copy both the prompt and response matters. It supports testing, documentation, reuse, and comparison across models.
Why This Matters
This was not a complex application.
That is partly the point.
In a very small amount of code, the app demonstrates several important AI engineering concepts:
API key isolation
Environment variable management
Provider abstraction
OpenAI-compatible SDK reuse
Dynamic model routing
Cost/quality tradeoff control
Response metadata inspection
Lightweight local UI development
Usability improvements for prompt/response workflows
The app is simple, but the architecture points toward a larger pattern:
Do not bind every AI application directly to one model.
Build a routing layer into the design.That routing layer may be OpenRouter, a custom router, an internal enterprise gateway, or a policy-controlled model broker. The core idea is the same: separate the application experience from the model selection decision.
Final Reflection
The most valuable part of this mini-project was seeing how quickly a basic Streamlit app could become a model-routing workbench.
The UI stayed simple.
The code stayed readable.
But the application moved from:
Call this one modelto:
Route this prompt to an appropriate model based on cost and qualityThat is a meaningful shift.
For small experiments, this makes model comparison easier. For larger systems, the same pattern becomes part of cost governance, reliability engineering, and AI platform design.
This project was a reminder that sometimes the most useful learning labs are not large systems. Sometimes they are small tools that make the invisible architecture visible.
GitHub Link:
Code for this mini-project is available on GitHub:




Comments