Building LLM-Powered Applications with Shiny for Python: Practical Insights

At Appsilon, we've been integrating Large Language Models into Shiny for Python applications for a while now. One thing became clear: the challenge isn't in the initial integration. Shiny for Python's ui.Chat component makes that straightforward. The real complexity lies in building applications that can evolve with your needs.


We've seen projects start with simple text completion, grow to include structured outputs, and eventually handle image processing. Each iteration brought new technical decisions: Use a unified LLM interface like LangChain, or work directly with OpenAI and Claude APIs? Streaming or non-streaming responses?

This post shares key architectural insights from our journey. You'll learn how to structure your project to make future iterations possible and efficient. We'll focus on practical decisions that impact maintainability and flexibility, drawing from our experience building production applications.

Let's dive in!

Smart Architectural Choices

The key to maintaining velocity in LLM projects is the separation of concerns. Your chat UI shouldn't need to know whether you're using GPT-4o, Claude, or a local model. 

Here's what works for us:

  1. Separate LLM provider logic - Create a dedicated LLM handler class that encapsulates all provider-specific code. We learned this the hard way - when we needed to switch from LangChain to raw OpenAI for image processing, having tightly coupled provider code meant touching multiple parts of the application. A clean separation makes these transitions manageable.
  2. Enable testing without APIs - Create a mock LLM handler class that implements the same interface as your real handler. This lets you test your application's UI behavior without making expensive API calls. If your app uses streaming responses, mock those too - it's crucial to test your application under the same conditions it will run in production. (A minimal sketch of both handlers follows this list.)
  3. Structured project layout - We use our open-source Tapyr template for Shiny Python applications. It provides a battle-tested structure that naturally supports these separations, handles environment setup, and sets up logging. This becomes crucial when your chat application grows to include features like conversation history or document processing.
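To make points 1 and 2 concrete, here is a minimal sketch of what that separation can look like. The `LLMHandler` protocol, `OpenAIHandler`, and `MockLLMHandler` names are our own illustrations, not part of Shiny or any library; the real handler assumes an `AsyncOpenAI` client from the openai package.

```python
from typing import AsyncIterator, Protocol

from openai import AsyncOpenAI


class LLMHandler(Protocol):
    """The only interface the UI layer depends on."""

    def stream_response(self, prompt: str) -> AsyncIterator[str]: ...


class OpenAIHandler:
    """Real handler: all OpenAI-specific code lives here and nowhere else."""

    def __init__(self, client: AsyncOpenAI, model: str = "gpt-4o"):
        self._client = client
        self._model = model

    async def stream_response(self, prompt: str) -> AsyncIterator[str]:
        stream = await self._client.chat.completions.create(
            model=self._model,
            messages=[{"role": "user", "content": prompt}],
            stream=True,
        )
        async for chunk in stream:
            delta = chunk.choices[0].delta.content
            if delta:
                yield delta


class MockLLMHandler:
    """Test double: same interface, no API calls, still streams."""

    async def stream_response(self, prompt: str) -> AsyncIterator[str]:
        for token in ["This ", "is ", "a ", "mocked ", "response."]:
            yield token
```

Because the UI only ever sees `LLMHandler`, swapping the mock for the real provider - or one provider for another - is a one-line change at startup.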

This might seem like overengineering for a simple chat application. However, we've found that LLM projects rarely stay simple. What starts as basic text completion often evolves into multi-modal interactions with structured outputs. Good architecture makes these transitions smooth without much overhead.

Table Comparing the Functionality of Different Ways of Interacting With LLMs in Python

| Functionality | LangChain | OpenAI API (Completion) | OpenAI API (Assistants) | Anthropic Claude |
| --- | --- | --- | --- | --- |
| Using Structured Output | Excellent structured output, works well with streaming | Basic Pydantic support, less robust than LangChain | No Pydantic support / response format | Basic JSON-specified response structure |
| Response format with Pydantic | Works well with Pydantic | Basic Pydantic support | No Pydantic support | No Pydantic support |
| Image Attachments | No support for sending images to the API | Supports image analysis well | Supports image analysis well | Supports image analysis well |
| Adding Files | No capability | No capability | Supports uploading files; the model can refer to them | No capability |
| Streaming Response | Excellent UX for structured streaming responses | Supports streaming responses | Supports streaming responses | Supports streaming responses |

When we mention "excellent UX for structured streaming responses," we're talking about the ability to show partial results to users in real-time, while maintaining a structured format. With LangChain, you can see responses building token by token - imagine watching a table populate gradually or JSON fields growing incrementally. 

While OpenAI's structured output API also streams responses, it works at the field level - you'll see complete Pydantic fields appear one at a time. Both approaches avoid making users wait for the complete response, but LangChain's token-by-token streaming often feels more fluid, especially for elements like code blocks where seeing the gradual construction can be more engaging.
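As an illustration of that token-by-token behaviour, here is a rough sketch of partial structured streaming with LangChain's `JsonOutputParser` (the prompt, keys, and model name are placeholders):

```python
from langchain_core.output_parsers import JsonOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

prompt = ChatPromptTemplate.from_template(
    "Return a JSON object with keys 'title' and 'bullet_points'. Topic: {topic}"
)
chain = prompt | ChatOpenAI(model="gpt-4o") | JsonOutputParser()

# Each item is the partially parsed object so far, growing as tokens arrive:
# {} -> {'title': 'Shi'} -> {'title': 'Shiny for Python'} -> ...
for partial in chain.stream({"topic": "Shiny for Python"}):
    print(partial)
```

In a Shiny app, each `partial` object can be rendered as soon as it arrives, which is what makes the gradual table or JSON build-up possible.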


Real-world Integration Notes

Getting started with LLMs in Shiny for Python is straightforward. The ui.Chat component provides everything you need for basic chat functionality. But as with most LLM projects, you quickly want to extend it in unexpected ways.
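For reference, a bare-bones chat app can look roughly like this (Shiny Express syntax; exact method names may differ slightly between Shiny versions, and the echo reply is a stand-in for a real LLM call):

```python
from shiny.express import ui

# Create the chat component with an initial message and render it.
chat = ui.Chat(id="chat", messages=["Hello! Ask me anything."])
chat.ui()


@chat.on_user_submit
async def handle_user_input():
    user_message = chat.user_input()
    # Swap this echo for a call to your LLM handler.
    await chat.append_message(f"You said: {user_message}")
```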

A typical journey looks like this: you begin with basic text interactions. Then you want structured responses to better integrate with your application logic - LangChain with Pydantic makes this easy (see the sketch below). But as your application grows, you might need to handle images, only to discover that LangChain doesn't support sending them to the API. So you might need to switch to OpenAI's native structured output for better control, performance, and a richer feature set. Each step brings new integration challenges.
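The structured-responses step typically looks something like this sketch, using LangChain's `with_structured_output` with a Pydantic model (the schema and model name are just examples):

```python
from langchain_openai import ChatOpenAI
from pydantic import BaseModel, Field


class TicketSummary(BaseModel):
    title: str = Field(description="Short title for the ticket")
    severity: str = Field(description="One of: low, medium, high")
    next_steps: list[str] = Field(description="Concrete follow-up actions")


llm = ChatOpenAI(model="gpt-4o")
structured_llm = llm.with_structured_output(TicketSummary)

summary = structured_llm.invoke("Customer reports the dashboard times out on login.")
# `summary` is a validated TicketSummary instance, ready for your app logic.
```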

Our experience taught us when to prototype. For image handling, we knew upfront we'd need to switch from LangChain to raw OpenAI calls, so creating a PoC was an obvious choice. What caught us off guard was transitioning between structured output implementations. 

Since both LangChain and OpenAI use Pydantic for validation, we assumed the switch would be straightforward. Instead, we discovered subtle differences - like OpenAI's sensitivity to field descriptions in Pydantic models.
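For comparison, a minimal sketch of the OpenAI-native equivalent using the SDK's `client.beta.chat.completions.parse` helper is shown below (schema and model name are illustrative). The `Field(description=...)` strings are exactly the part OpenAI turned out to be sensitive to.

```python
from openai import OpenAI
from pydantic import BaseModel, Field


class TicketSummary(BaseModel):
    # With OpenAI's structured outputs, these descriptions become part of the
    # schema the model sees, so their wording matters.
    title: str = Field(description="Short title for the ticket")
    severity: str = Field(description="One of: low, medium, high")
    next_steps: list[str] = Field(description="Concrete follow-up actions")


client = OpenAI()
completion = client.beta.chat.completions.parse(
    model="gpt-4o-2024-08-06",
    messages=[{"role": "user", "content": "Customer reports the dashboard times out on login."}],
    response_format=TicketSummary,
)
summary = completion.choices[0].message.parsed
```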

This taught us a valuable lesson: create isolated proofs of concept even when tools seem perfectly compatible.

The most valuable lesson? It's not about choosing the perfect tools upfront. It's about building your application so it can adapt as requirements evolve. Whether you're using LangChain's abstractions or raw provider APIs, good architecture makes these transitions manageable.

Lessons Learned

Building LLM-powered applications is an iterative process.

Here are our key takeaways:

  1. Create small proofs of concept - not just for new features, but also when switching between seemingly compatible tools. This habit saves significant refactoring time later.
  2. Design your code for change. A clear separation between LLM logic and application code isn't overengineering - it's preparation for inevitable evolution.
  3. When starting a new Shiny for Python project with LLMs, consider using our open-source Tapyr template. It provides the structure needed to maintain clean separation of concerns as your application grows.

Want to discuss your Shiny for Python project? Reach out to Appsilon. We're here to help you build robust, scalable LLM applications.
