<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[Full Stack ML]]></title><description><![CDATA[From full stack developer to machine learning practitioner]]></description><link>https://www.fullstackml.dev</link><image><url>https://substackcdn.com/image/fetch/$s_!Xqe4!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd29b1909-71a3-4f21-a045-2cf50f9b0db2_1024x1024.png</url><title>Full Stack ML</title><link>https://www.fullstackml.dev</link></image><generator>Substack</generator><lastBuildDate>Tue, 21 Apr 2026 11:18:41 GMT</lastBuildDate><atom:link href="https://www.fullstackml.dev/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[JO Reyes]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[joreyes@substack.com]]></webMaster><itunes:owner><itunes:email><![CDATA[joreyes@substack.com]]></itunes:email><itunes:name><![CDATA[JO Reyes]]></itunes:name></itunes:owner><itunes:author><![CDATA[JO Reyes]]></itunes:author><googleplay:owner><![CDATA[joreyes@substack.com]]></googleplay:owner><googleplay:email><![CDATA[joreyes@substack.com]]></googleplay:email><googleplay:author><![CDATA[JO Reyes]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[#15 - Where Does the Time Go? 
Measuring Voice-to-Voice Latency with Pipecat, Modal and Twilio]]></title><description><![CDATA[We recently hit a one-second latency milestone with our inbound call application - but what&#8217;s actually happening under the hood during that second?]]></description><link>https://www.fullstackml.dev/p/15-where-does-the-time-go-measuring</link><guid isPermaLink="false">https://www.fullstackml.dev/p/15-where-does-the-time-go-measuring</guid><dc:creator><![CDATA[JO Reyes]]></dc:creator><pubDate>Sat, 21 Feb 2026 00:13:50 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!_v31!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66c4b4ba-1e18-4366-80b1-4a9d86e69ff1_1024x1024.heic" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>After building the <a href="https://github.com/jaeyow/twilio-chatbot/tree/main/inbound">inbound</a> and <a href="https://github.com/jaeyow/twilio-chatbot/tree/main/outbound">outbound</a> voice bots, I kept coming back to the same question: <em>how fast is this thing, really?</em></p><p>I could feel it during calls. You stop talking, there&#8217;s a short pause, and then the bot responds. It felt like maybe a second. Not bad - definitely fast enough to have a conversation. But &#8220;it feels like about a second&#8221; isn&#8217;t a measurement. 
And if I wanted to make it faster, I needed to know where that time was actually going.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!_v31!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66c4b4ba-1e18-4366-80b1-4a9d86e69ff1_1024x1024.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!_v31!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66c4b4ba-1e18-4366-80b1-4a9d86e69ff1_1024x1024.heic 424w, https://substackcdn.com/image/fetch/$s_!_v31!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66c4b4ba-1e18-4366-80b1-4a9d86e69ff1_1024x1024.heic 848w, https://substackcdn.com/image/fetch/$s_!_v31!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66c4b4ba-1e18-4366-80b1-4a9d86e69ff1_1024x1024.heic 1272w, https://substackcdn.com/image/fetch/$s_!_v31!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66c4b4ba-1e18-4366-80b1-4a9d86e69ff1_1024x1024.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!_v31!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66c4b4ba-1e18-4366-80b1-4a9d86e69ff1_1024x1024.heic" width="728" height="728" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/66c4b4ba-1e18-4366-80b1-4a9d86e69ff1_1024x1024.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;normal&quot;,&quot;height&quot;:1024,&quot;width&quot;:1024,&quot;resizeWidth&quot;:728,&quot;bytes&quot;:205031,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://fullstackml.dev/i/188222934?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66c4b4ba-1e18-4366-80b1-4a9d86e69ff1_1024x1024.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!_v31!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66c4b4ba-1e18-4366-80b1-4a9d86e69ff1_1024x1024.heic 424w, https://substackcdn.com/image/fetch/$s_!_v31!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66c4b4ba-1e18-4366-80b1-4a9d86e69ff1_1024x1024.heic 848w, https://substackcdn.com/image/fetch/$s_!_v31!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66c4b4ba-1e18-4366-80b1-4a9d86e69ff1_1024x1024.heic 1272w, https://substackcdn.com/image/fetch/$s_!_v31!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66c4b4ba-1e18-4366-80b1-4a9d86e69ff1_1024x1024.heic 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container 
restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Is it the speech recognition? The LLM thinking? The text-to-speech? The network round-trips between all these services? Without numbers, I&#8217;d just be guessing.</p><p>Turns out, the answer was hiding in plain sight. Pipecat had been tracking this data all along - I just wasn&#8217;t listening.</p><h2><strong>The Data Was Already There</strong></h2><p>Here&#8217;s something that surprised me. In my bot code, I&#8217;d had these two flags set from the very beginning:</p><pre><code><code>task = PipelineTask(
    pipeline,
    params=PipelineParams(
        audio_in_sample_rate=8000,
        audio_out_sample_rate=8000,
        enable_metrics=True,
        enable_usage_metrics=True,
    ),
)</code></code></pre><p>See <code>enable_metrics=True</code> and <code>enable_usage_metrics=True</code>? Those tell <a href="https://docs.pipecat.ai/">Pipecat</a> to measure the performance of every service in the pipeline - how long Deepgram takes to return the first transcription, how long Groq takes to start generating tokens, how long the TTS takes to produce audio. Every time audio flows through the pipeline, Pipecat wraps these measurements into something called a <code>MetricsFrame</code> and sends it downstream.</p><p>But here&#8217;s the thing: nobody was listening. The metrics frames were flowing through the pipeline and getting silently discarded at the end. It&#8217;s like having a speedometer in your car that works perfectly - you just never look at it.</p><p>What I needed was an <strong>observer</strong>.</p><h2><strong>Pipecat&#8217;s Observer Pattern</strong></h2><p>Pipecat has a clean concept for this: <strong>observers</strong>. An observer is a side-channel listener that sees every frame flowing between processors in the pipeline, but it&#8217;s not <em>in</em> the pipeline. It can&#8217;t block anything or slow things down. It just watches.</p><p>Think of it like a security camera in a factory. The assembly line keeps running exactly the same whether the cameras are on or off. 
But when they&#8217;re on, you can see exactly what&#8217;s happening at every step.</p><p>Pipecat ships with several built-in observers, and two of them were exactly what I needed:</p><ul><li><p><code>MetricsLogObserver</code> - Logs every metrics frame to the console: TTFB (time to first byte) for each service, token usage, TTS character counts, Smart Turn decision metrics</p></li><li><p><code>UserBotLatencyLogObserver</code> - Measures the wall-clock time from when the user stops speaking to when the bot starts speaking, and gives you a summary at the end of the call</p></li></ul><p>Adding them took three lines of code.</p><h2><strong>Three Lines to Instant Visibility</strong></h2><p>Here&#8217;s the change. I added the observer imports and passed them to the <code>PipelineTask</code>:</p><pre><code><code>from pipecat.observers.loggers.metrics_log_observer import MetricsLogObserver
from pipecat.observers.loggers.user_bot_latency_log_observer import UserBotLatencyLogObserver

# ... existing pipeline setup ...

task = PipelineTask(
    pipeline,
    params=PipelineParams(
        audio_in_sample_rate=8000,
        audio_out_sample_rate=8000,
        enable_metrics=True,
        enable_usage_metrics=True,
    ),
    observers=[
        MetricsLogObserver(),
        UserBotLatencyLogObserver(),
    ],
)</code></code></pre><p>That&#8217;s it. No other changes. The pipeline runs exactly the same as before - the observers are purely additive.</p><p>Now when I made a test call, the logs lit up with data I&#8217;d never seen before:</p><pre><code><code>DEBUG | &#128202; [DeepgramSTTService#0] TTFB (nova-3): 0.200s at 14.231s
DEBUG | &#128202; [LocalSmartTurnAnalyzerV3#0] SMART TURN: COMPLETE (probability: 95.10%, inference: 11.2ms, server: 14.8ms, e2e: 91.3ms) at 14.322s
DEBUG | &#128202; [GroqLLMService#0] TTFB (llama-3.3-70b-versatile): 0.268s at 14.590s
DEBUG | &#128202; [GroqLLMService#0] LLM TOKEN USAGE (llama-3.3-70b-versatile): prompt: 290, completion: 17, total: 307 at 14.95s
DEBUG | &#128202; [DeepgramTTSService#0] TTFB (aura-2-theia-en): 0.044s at 14.634s
DEBUG | &#128202; [DeepgramTTSService#0] TTS USAGE (aura-2-theia-en): 32 characters at 14.95s
DEBUG | &#9201;&#65039; LATENCY FROM USER STOPPED SPEAKING TO BOT STARTED SPEAKING: 0.976s
</code></code></pre><p>Let me break down what each line means:</p><ul><li><p><strong>STT TTFB: 0.200s</strong> - Deepgram took 200 milliseconds to return the first word of the transcription after receiving audio</p></li><li><p><strong>Smart Turn: 91.3ms e2e</strong> - The turn detection model took 91ms to decide (with 95.1% confidence) that I was actually done speaking, not just pausing</p></li><li><p><strong>LLM TTFB: 0.268s</strong> - Groq took 268 milliseconds to start generating the first token of the response</p></li><li><p><strong>LLM Token Usage: 290 prompt / 17 completion</strong> - The context size and response length for this turn</p></li><li><p><strong>TTS TTFB: 0.044s</strong> - Deepgram&#8217;s TTS took 44 milliseconds to start producing audio from the first text it received</p></li><li><p><strong>Total latency: 0.976s</strong> - The wall-clock time from when I stopped talking to when I heard the bot start responding</p></li></ul><p>At the end of the call, <code>UserBotLatencyLogObserver</code> printed a summary:</p><pre><code><code>INFO | &#9201;&#65039; LATENCY FROM USER STOPPED SPEAKING TO BOT STARTED SPEAKING - Avg: 1.048s, Min: 0.976s, Max: 1.159s
</code></code></pre><p>This was already revealing. But I wanted more detail - specifically, I wanted to see the breakdown <em>per turn</em> in a single table, so I could spot patterns across a conversation.</p><h2><strong>Building a Custom Observer</strong></h2><p>The built-in observers are great for a quick look, but their log lines aren&#8217;t correlated per turn. You can&#8217;t easily see &#8220;for turn 3, what was the STT latency vs. the LLM latency?&#8221; To get that, I built a custom observer called <code>LatencyBreakdownObserver</code>.</p><p>Building a custom observer is straightforward. You subclass <code>BaseObserver</code> and implement <code>on_push_frame</code>, which gets called every time a frame passes between two processors in the pipeline:</p><pre><code><code>from pipecat.observers.base_observer import BaseObserver, FramePushed

class LatencyBreakdownObserver(BaseObserver):
    async def on_push_frame(self, data: FramePushed) -&gt; None:
        # data.frame is the frame being passed
        # data.source is the processor sending it
        # data.destination is the processor receiving it
        # data.direction is DOWNSTREAM or UPSTREAM
        ...
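        # A sketch of what goes in here (assuming Pipecat's frame classes,
        # e.g. from pipecat.frames.frames - see observers.py for the full version):
        #
        #   if isinstance(data.frame, VADUserStoppedSpeakingFrame):
        #       self._turn_started_at = time.time()  # user went quiet: start the clock
        #   elif isinstance(data.frame, BotStartedSpeakingFrame):
        #       total = time.time() - self._turn_started_at  # stop the clock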
</code></code></pre><p>The key insight is that certain frame types mark the boundaries of a conversation turn:</p><ul><li><p><code>VADUserStoppedSpeakingFrame</code> - The user finished talking. Start the clock.</p></li><li><p><code>MetricsFrame</code> - Contains TTFB measurements from each service. Capture them.</p></li><li><p><code>BotStartedSpeakingFrame</code> - The bot started talking. Stop the clock.</p></li><li><p><code>EndFrame</code> / <code>CancelFrame</code> - The call is over. Print the summary.</p></li></ul><p>I created a simple dataclass to hold per-turn data:</p><pre><code><code>@dataclass
class TurnLatency:
    turn_number: int
    stt_ttfb: Optional[float] = None          # seconds
    smart_turn_e2e_ms: Optional[float] = None  # milliseconds
    llm_ttfb: Optional[float] = None           # seconds
    tts_ttfb: Optional[float] = None           # seconds
    total_wall_clock: Optional[float] = None   # seconds
    llm_prompt_tokens: Optional[int] = None
    llm_completion_tokens: Optional[int] = None
    tts_characters: Optional[int] = None
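
    # A hypothetical helper (not part of the original dataclass): the wall-clock
    # time the service TTFBs don't explain - i.e. network and buffering overhead.
    def overhead_seconds(self) -&gt; Optional[float]:
        if self.total_wall_clock is None:
            return None
        measured = sum(t or 0.0 for t in (self.stt_ttfb, self.llm_ttfb, self.tts_ttfb))
        measured += (self.smart_turn_e2e_ms or 0.0) / 1000.0  # ms to seconds
        return self.total_wall_clock - measured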
</code></code></pre><p>The observer collects one <code>TurnLatency</code> per conversation turn. When it sees a <code>VADUserStoppedSpeakingFrame</code>, it starts a new turn and records the wall-clock time. As <code>MetricsFrame</code> objects arrive, it routes each measurement to the right field by checking the processor name (e.g., <code>"DeepgramSTTService"</code> for STT, <code>"GroqLLMService"</code> for the LLM). When <code>BotStartedSpeakingFrame</code> arrives, it stops the clock and saves the turn.</p><p>The full implementation is in <a href="https://github.com/jaeyow/twilio-chatbot/blob/main/latency/observers.py">observers.py</a> - it&#8217;s about 100 lines of focused code.</p><p>I added it alongside the built-in observers:</p><pre><code><code>from observers import LatencyBreakdownObserver

task = PipelineTask(
    pipeline,
    params=PipelineParams(
        audio_in_sample_rate=8000,
        audio_out_sample_rate=8000,
        enable_metrics=True,
        enable_usage_metrics=True,
    ),
    observers=[
        MetricsLogObserver(),
        UserBotLatencyLogObserver(),
        LatencyBreakdownObserver(),
    ],
)</code></code></pre><h2><strong>The Numbers: Where Does 1 Second Go?</strong></h2><p>Here&#8217;s what the summary table looks like after a real call. This is the payoff - the table prints to the logs when the call ends:</p><pre><code><code>=== LATENCY BREAKDOWN (4 turns) ===
Turn | Total  | STT TTFB | Smart Turn | LLM TTFB | TTS TTFB | LLM Tokens | TTS Chars
-----+--------+----------+------------+----------+----------+------------+----------
   2 | 0.976s |   0.200s |       91ms |   0.268s |   0.044s |     290/17 |       32
   3 | 1.001s |   0.266s |       82ms |   0.268s |   0.029s |     323/33 |      111
   4 | 1.057s |   0.155s |       84ms |   0.268s |   0.013s |     388/84 |      228
   6 | 1.159s |   0.157s |       81ms |   0.271s |   0.006s |     504/33 |       13
-----+--------+----------+------------+----------+----------+------------+----------
 Avg | 1.048s |   0.195s |       84ms |   0.268s |   0.023s |            |</code></code></pre><p>A few things jump out immediately.</p><p><strong>The LLM is the slowest component.</strong> At 268ms average TTFB, the LLM takes more time than any other service. This is still a 70B parameter model running on Groq, but it&#8217;s notably slower than the sub-100ms performance you might expect. This could be due to API queuing, model load, or the specific context size.</p><p><strong>The AI services combined take about 570ms.</strong> Add up the averages: STT (195ms) + Smart Turn (84ms) + LLM (268ms) + TTS (23ms) = roughly 570ms. That&#8217;s the time these services spend doing their thing.</p><p><strong>So where&#8217;s the other 480ms?</strong> The total averages 1.048 seconds, but only ~570ms is accounted for by service TTFB. The rest is pipeline overhead: network round-trips between your server and each external API (Twilio to Modal, Modal to Deepgram, Modal to Groq, Modal to Deepgram again), audio buffering, frame serialisation, and the VAD&#8217;s own silence detection window before <code>VADUserStoppedSpeakingFrame</code> even fires.</p><p><strong>Turn 4 generated the longest response.</strong> With 84 completion tokens and 228 TTS characters, it had the most content to generate. Interestingly, the LLM TTFB stayed constant at 268ms regardless of response length - time to first token doesn&#8217;t depend on how many tokens will eventually be generated.</p><p><strong>Context grows, but TTFB stays remarkably flat.</strong> Notice how the prompt token count grows from 290 to 504 across the conversation (each turn adds to the context), but LLM TTFB barely changes (268ms to 271ms). 
Despite doubling the context size, inference latency remains essentially constant.</p><h2><strong>What Would Actually Make It Faster?</strong></h2><p>Now that we have numbers, we can think about optimisation rationally instead of guessing.</p><p><strong>The LLM is the biggest opportunity.</strong> At 268ms average TTFB, the LLM accounts for over 25% of the total latency. This could potentially be improved by using a smaller/faster model, ensuring optimal Groq API configuration, or exploring self-hosted LLM inference with a model optimized for low latency. The question is whether faster inference would compromise response quality.</p><p><strong>Network round-trips and pipeline overhead matter.</strong> About 480ms (nearly half the total latency) is spent on the overhead described earlier: network round-trips between the server and each external API, audio buffering, frame serialisation, and the VAD&#8217;s silence detection window. Self-hosting these services on the same infrastructure could eliminate many of these round-trips. Where each component is deployed also matters: keeping everything in the same region shortens every round-trip.</p><p><strong>Smart Turn is surprisingly fast.</strong> At just 84ms average, the turn detection is much faster than expected and not a bottleneck. The VAD&#8217;s <code>stop_secs</code> parameter (likely set to around 0.3-0.5s) adds some additional latency, but further reduction would risk more false positives.</p><p><strong>STT and TTS are well-optimised.</strong> Deepgram STT at 195ms and TTS at 23ms are both performing well. 
There&#8217;s not much to gain here without sacrificing quality.</p><h2><strong>Try It Yourself</strong></h2><p><strong>All the code for this - the observers, the bot changes, everything - is in the </strong><code>latency/</code><strong><a href="https://github.com/jaeyow/twilio-deepgram-voice/tree/main/latency"> directory</a> of the GitHub repo</strong>.</p><p>It&#8217;s a self-contained version of the inbound bot with the observers wired up. Deploy it with:</p><pre><code><code>cd twilio-chatbot/latency
cp env.example .env
# Fill in your API keys in .env
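# modal serve gives a hot-reloading dev URL; use modal deploy for a stable one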
modal serve modal_app.py</code></code></pre><p>Make a call, and you&#8217;ll see the metrics flowing in your logs. The observers work identically with the outbound bot too - just add the same three imports and the <code>observers</code> parameter to the <code>PipelineTask</code>.</p><p>What I find most interesting about these numbers is what they suggest about the <em>next</em> step. The API services are fast, but the network overhead is significant. What if you could run your own STT, LLM, and TTS on the same GPU, eliminating the network hops entirely? Pipecat already supports self-hosted alternatives like Faster-Whisper for STT, vLLM or Ollama for the LLM, and XTTS for TTS. The question is: would self-hosting actually be faster - and at what cost?</p><p>That&#8217;s exactly what I&#8217;m going to find out next.</p>]]></content:encoded></item><item><title><![CDATA[#14 - When the AI Calls You: Building an Outbound Voice Bot with Pipecat, Twilio, Deepgram and Modal]]></title><description><![CDATA[Building a full-featured outbound Voice AI application (using a Filipino English accent) with a $0 budget]]></description><link>https://www.fullstackml.dev/p/14-when-the-ai-calls-you-building</link><guid isPermaLink="false">https://www.fullstackml.dev/p/14-when-the-ai-calls-you-building</guid><dc:creator><![CDATA[JO Reyes]]></dc:creator><pubDate>Fri, 13 Feb 2026 15:12:19 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!d1tW!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa92e383-7193-4c09-aacf-c237ef867bda_1696x608.heic" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>In the <a href="https://fullstackml.dev/p/voice-ai-over-the-phone-a-practical">last article</a>, I built a voice AI that answers incoming phone calls. You call a number, an AI named Miss Harper picks up, and you have a real conversation. 
It was surprisingly straightforward to build, and I was pretty happy with it.</p><p>But the moment it worked, a different question started bugging me: <em>what if the bot could call someone else?</em></p><p>Think about it. An inbound bot waits for people to call. That&#8217;s useful - it&#8217;s basically an AI receptionist. But an outbound bot? That opens up a whole different set of possibilities. Appointment reminders that actually talk to you. Customer check-in calls. Survey bots. Automated outreach. Even something as simple as a daily wake-up call from a friendly AI voice.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!d1tW!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa92e383-7193-4c09-aacf-c237ef867bda_1696x608.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!d1tW!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa92e383-7193-4c09-aacf-c237ef867bda_1696x608.heic 424w, https://substackcdn.com/image/fetch/$s_!d1tW!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa92e383-7193-4c09-aacf-c237ef867bda_1696x608.heic 848w, https://substackcdn.com/image/fetch/$s_!d1tW!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa92e383-7193-4c09-aacf-c237ef867bda_1696x608.heic 1272w, https://substackcdn.com/image/fetch/$s_!d1tW!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa92e383-7193-4c09-aacf-c237ef867bda_1696x608.heic 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!d1tW!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa92e383-7193-4c09-aacf-c237ef867bda_1696x608.heic" width="1456" height="522" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fa92e383-7193-4c09-aacf-c237ef867bda_1696x608.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:522,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:215437,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://fullstackml.dev/i/187380354?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa92e383-7193-4c09-aacf-c237ef867bda_1696x608.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!d1tW!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa92e383-7193-4c09-aacf-c237ef867bda_1696x608.heic 424w, https://substackcdn.com/image/fetch/$s_!d1tW!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa92e383-7193-4c09-aacf-c237ef867bda_1696x608.heic 848w, https://substackcdn.com/image/fetch/$s_!d1tW!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa92e383-7193-4c09-aacf-c237ef867bda_1696x608.heic 1272w, https://substackcdn.com/image/fetch/$s_!d1tW!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa92e383-7193-4c09-aacf-c237ef867bda_1696x608.heic 1456w" 
sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The idea seemed harder at first. With inbound, the caller starts things - Twilio just hands the call to your bot. With outbound, <em>you</em> have to start the call. Your code has to tell Twilio &#8220;dial this number,&#8221; wait for the person to pick up, and <em>then</em> connect the AI. It&#8217;s a different dance.</p><p>But it turns out the core of it - the AI pipeline that listens, thinks, and talks back - is exactly the same. The only difference is the plumbing around it. 
And once you understand that plumbing, it clicks.</p><p>Here&#8217;s a recording of the outbound bot in action - it dials a number, the person picks up, and the conversation flows naturally:</p><div class="native-audio-embed" data-component-name="AudioPlaceholder" data-attrs="{&quot;label&quot;:null,&quot;mediaUploadId&quot;:&quot;a819c0ce-b14a-40e9-9b2b-5dba5b3ad1a7&quot;,&quot;duration&quot;:65.69796,&quot;downloadable&quot;:true,&quot;isEditorNode&quot;:true}"></div><p>In this article, I&#8217;ll walk through how I built the outbound bot. If you haven&#8217;t read the first article on the inbound bot, I&#8217;d recommend starting there - I cover the basics of the tech stack (<a href="https://docs.pipecat.ai/">Pipecat</a>, <a href="https://groq.com/">Groq</a>, <a href="https://deepgram.com/">Deepgram</a>, and <a href="https://modal.com/">Modal</a>) in more detail. But this article should still make sense on its own.</p><h2><strong>Inbound vs. Outbound: What Actually Changes?</strong></h2><p>Here&#8217;s the good news: most of the code is the same. The AI pipeline - the part that listens to speech, converts it to text, generates a response, and speaks it back - is identical. Same <a href="https://docs.pipecat.ai/">Pipecat</a> pipeline. Same <a href="https://groq.com/">Groq</a> LLM. Same <a href="https://deepgram.com/">Deepgram</a> for speech recognition and text-to-speech. Same noise filtering. 
Same smart turn detection.</p><p>What changes is <strong>who starts the call</strong> and <strong>how the connection gets set up</strong>.</p><p>With the <strong>inbound</strong> bot:</p><ol><li><p>Someone calls your Twilio number</p></li><li><p>Twilio asks your server for instructions on what to do</p></li><li><p>Your server returns <a href="https://www.twilio.com/docs/glossary/what-is-twilio-markup-language-twiml">TwiML</a> that tells Twilio to &#8220;open a WebSocket and stream the audio&#8221;</p></li><li><p>The pipeline handles the conversation</p></li></ol><p>With the <strong>outbound</strong> bot:</p><ol><li><p><strong>You</strong> send a request to your own server: &#8220;call this number&#8221;</p></li><li><p>Your server tells Twilio: &#8220;dial +1234567890 and when they pick up, ask me what to do next&#8221;</p></li><li><p>The person answers their phone</p></li><li><p>Twilio asks your server for instructions on what to do</p></li><li><p>Your server returns TwiML that tells Twilio to &#8220;open a WebSocket and stream the audio&#8221;</p></li><li><p>The pipeline handles the conversation</p></li></ol><p>See the difference? There&#8217;s one extra step at the beginning - you kick things off instead of waiting. After the person picks up, everything is the same.</p><p>But that one extra step means our server needs an additional endpoint (a URL on your server that accepts requests) to handle it. Which brings us to the interesting part.</p><h2><strong>The Three-Endpoint Dance</strong></h2><p>The inbound bot needed just two endpoints:</p><ul><li><p><strong>POST /</strong> - Twilio hits this when someone calls. We respond with instructions.</p></li><li><p><strong>WebSocket /ws</strong> - Where the real-time audio streams back and forth.</p></li></ul><p>The outbound bot needs three:</p><h3><strong>1. POST /dialout - &#8220;Start a call&#8221;</strong></h3><p>This is where it all begins. 
You send a request to this endpoint with two pieces of information: the phone number you want to call and the <a href="https://www.twilio.com/">Twilio</a> phone number to call <em>from</em>.</p><pre><code><code>POST /dialout
{
  "to_number": "+15551234567",
  "from_number": "+15559876543"
}</code></code></pre><p>When this endpoint receives your request, it uses the <a href="https://www.twilio.com/docs/voice/api">Twilio REST API</a> to start a phone call. It tells Twilio: &#8220;Dial this number, and when the person picks up, go to my <code>/twiml</code> endpoint to get instructions.&#8221;</p><p>Twilio starts ringing the phone. You get back a response with a call SID so you can track it.</p><h3><strong>2. POST /twiml - &#8220;What to do when they answer&#8221;</strong></h3><p>When the person picks up, Twilio needs to know what to do with the call. So it reaches out to this endpoint. Our server responds with <a href="https://www.twilio.com/docs/voice/twiml">TwiML</a> (Twilio&#8217;s XML-based instruction language - think of it as a short instruction sheet that tells Twilio how to handle a call) that says: &#8220;Open a WebSocket connection to <code>/ws</code> and start streaming audio.&#8221;</p><p>There&#8217;s one clever detail here: we pass the phone numbers (who&#8217;s calling, who&#8217;s being called) as parameters in the WebSocket stream. This way, the bot knows who it&#8217;s talking to and can personalise the conversation.</p><h3><strong>3. WebSocket /ws - &#8220;The actual conversation&#8221;</strong></h3><p>This is the same WebSocket endpoint from the inbound bot. 
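</p><p>Under the hood, Twilio frames that audio as JSON messages over the WebSocket - each &#8220;media&#8221; event carries a base64-encoded chunk of 8&#160;kHz mu-law audio. Here is a stdlib-only sketch of decoding one such event (the message below is illustrative, with a placeholder stream SID; Pipecat&#8217;s Twilio transport does this parsing for you):</p>

```python
import base64
import json

# An example "media" event, shaped like the JSON Twilio Media Streams
# sends over the WebSocket (streamSid is a made-up placeholder).
raw = json.dumps({
    "event": "media",
    "streamSid": "MZ0000000000000000000000000000000000",
    "media": {"payload": base64.b64encode(b"\xff\x7f\x00").decode("ascii")},
})

msg = json.loads(raw)
audio = b""
if msg["event"] == "media":
    # Decode the chunk; the real pipeline feeds these bytes to the
    # noise filter and speech-to-text stages.
    audio = base64.b64decode(msg["media"]["payload"])

print(len(audio))  # each event carries one small chunk of call audio
```

<p>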
Once it&#8217;s open, audio flows in both directions through <a href="https://www.twilio.com/docs/voice/media-streams">Twilio Media Streams</a> and the Pipecat pipeline does its thing - listen, think, respond.</p><p>Here&#8217;s the whole flow as a diagram:</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!abuB!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0970ebb0-5443-4071-baf8-057e9b6b455b_7862x410.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!abuB!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0970ebb0-5443-4071-baf8-057e9b6b455b_7862x410.heic 424w, https://substackcdn.com/image/fetch/$s_!abuB!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0970ebb0-5443-4071-baf8-057e9b6b455b_7862x410.heic 848w, https://substackcdn.com/image/fetch/$s_!abuB!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0970ebb0-5443-4071-baf8-057e9b6b455b_7862x410.heic 1272w, https://substackcdn.com/image/fetch/$s_!abuB!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0970ebb0-5443-4071-baf8-057e9b6b455b_7862x410.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!abuB!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0970ebb0-5443-4071-baf8-057e9b6b455b_7862x410.heic" width="1200" height="62.637362637362635" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0970ebb0-5443-4071-baf8-057e9b6b455b_7862x410.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:76,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:61185,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://fullstackml.dev/i/187380354?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0970ebb0-5443-4071-baf8-057e9b6b455b_7862x410.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!abuB!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0970ebb0-5443-4071-baf8-057e9b6b455b_7862x410.heic 424w, https://substackcdn.com/image/fetch/$s_!abuB!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0970ebb0-5443-4071-baf8-057e9b6b455b_7862x410.heic 848w, https://substackcdn.com/image/fetch/$s_!abuB!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0970ebb0-5443-4071-baf8-057e9b6b455b_7862x410.heic 1272w, https://substackcdn.com/image/fetch/$s_!abuB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0970ebb0-5443-4071-baf8-057e9b6b455b_7862x410.heic 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>The key insight: the three endpoints form a chain. You trigger the first one, and the rest happen automatically. 
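</p><p>For concreteness, the TwiML instruction sheet can be sketched as a small helper. The endpoint and parameter names follow this article; the actual handler in the repo returns an equivalent response:</p>

```python
# Build the TwiML that tells Twilio to open a WebSocket to /ws and pass
# the phone numbers along as <Parameter>s. Hypothetical helper function;
# the repo's /twiml handler produces an equivalent document.
def stream_twiml(host: str, to_number: str, from_number: str) -> str:
    return (
        "<Response><Connect>"
        f'<Stream url="wss://{host}/ws">'
        f'<Parameter name="to_number" value="{to_number}"/>'
        f'<Parameter name="from_number" value="{from_number}"/>'
        "</Stream></Connect></Response>"
    )

xml = stream_twiml("your-app.modal.run", "+15551234567", "+15559876543")
print(xml)
```

<p>Pointing the <code>&lt;Stream&gt;</code> at a <code>wss://</code> address is what turns the phone call into a live two-way audio stream.</p><p>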
From the perspective of the person receiving the call, it just feels like their phone rang and someone started talking to them.</p><h2><strong>The Pipeline: Same Trusted Stack</strong></h2><p>The AI pipeline is nearly identical to the inbound version. Same building blocks:</p><ul><li><p><strong><a href="https://deepgram.com/">Deepgram</a> <a href="https://deepgram.com/learn/nova-3-speech-to-text-api">nova-3</a></strong> for speech-to-text (turning what the person says into text)</p></li><li><p><strong><a href="https://groq.com/">Groq</a> with <a href="https://www.llama.com/">Llama 3.3 70B</a></strong> for generating responses</p></li><li><p><strong><a href="https://deepgram.com/">Deepgram</a> Amalthea</strong> voice for text-to-speech (turning the response back into natural-sounding audio with a Filipino English accent)</p></li><li><p><strong><a href="https://jmvalin.ca/demo/rnnoise/">RNNoise</a></strong> for cleaning up background noise</p></li><li><p><strong><a href="https://github.com/snakers4/silero-vad">Silero</a> VAD + Smart Turn v3</strong> for knowing when the person has finished talking</p></li></ul><p>The main difference is the <strong>system prompt</strong>. Since the bot is the one making the call, it needs to introduce itself and explain why it&#8217;s calling - you wouldn&#8217;t want it to just sit there waiting for the other person to speak first. The prompt tells the bot:</p><blockquote><p>&#8220;You are a friendly assistant making an outbound phone call. Your responses will be read aloud, so keep them concise and conversational. Avoid special characters or formatting. Begin by politely greeting the person and explaining why you&#8217;re calling.&#8221;</p></blockquote><p>And just like the inbound bot, if the person goes quiet for more than 5 seconds, the bot picks up the conversation and keeps things moving.</p><h2><strong>Setting Up and Making Your First Call</strong></h2><p>The setup is almost the same as the inbound bot. 
You need the same accounts (<a href="https://www.twilio.com/">Twilio</a>, <a href="https://console.groq.com/">Groq</a>, <a href="https://deepgram.com/">Deepgram</a>, <a href="https://modal.com/">Modal</a>), plus one difference: the outbound bot also uses the <code>twilio</code> Python library to make API calls for starting calls. All of these services have free tiers - a handful of test calls won&#8217;t come close to using up the free credits.</p><h3><strong>Deploy to Modal</strong></h3><pre><code>cd twilio-chatbot/outbound
cp env.example .env
# Fill in your API keys in .env
modal serve modal_app.py</code></pre><p>This gives you a URL like: </p><p>https://your-workspace--twilio-outbound-bot-serve.modal.run</p><h3><strong>Make a Call</strong></h3><p>Now the fun part. To make the bot call someone, you just send a request:</p><pre><code>curl -X POST https://your-workspace--twilio-outbound-bot-serve.modal.run/dialout \
  -H "Content-Type: application/json" \
  -d '{"to_number": "+15551234567", "from_number": "+15559876543"}'</code></pre><p>Replace <code>to_number</code> with the number you want to call and <code>from_number</code> with your Twilio phone number.</p><p>The repo also includes a handy test script that reads the numbers from your <code>.env</code> file:</p><pre><code>cd scripts
uv run python test_call.py https://your-workspace--twilio-outbound-bot-serve.modal.run</code></pre><p>A few seconds after you run this, the phone will ring. When you pick up, the bot will greet you and start a conversation. Here&#8217;s what to expect: you&#8217;ll hear a brief pause (Modal needs to spin up a worker), then the bot will introduce itself. Say something back - you&#8217;re having a conversation with your own outbound voice AI.</p><p>Notice something? With the inbound bot, you had to configure a webhook in the <a href="https://console.twilio.com/">Twilio Console</a>. With outbound, you don&#8217;t. Your code is the one calling Twilio - not the other way around. Twilio only calls your <code>/twiml</code> endpoint because your code tells it to (in the <code>/dialout</code> handler). No console configuration needed.</p><h3><strong>Recordings</strong></h3><p>Every call is automatically recorded using <a href="https://www.twilio.com/docs/voice/tutorials/how-to-record-phone-calls">Twilio&#8217;s built-in call recording</a>. When the WebSocket connects, the bot starts a dual-channel recording via the <a href="https://www.twilio.com/docs/voice/api/recording-resource">Twilio REST API</a>. 
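</p><p>That REST call boils down to a single POST against the call&#8217;s Recordings subresource. A sketch of the request shape using only the standard library (the repo makes the same call through the <code>twilio</code> Python library; the SIDs below are placeholders):</p>

```python
from urllib.parse import urlencode

# Shape of the request that starts a dual-channel recording on a live
# call (the repo issues the same request via the twilio Python library).
def recording_request(account_sid: str, call_sid: str) -> tuple[str, str]:
    url = (
        "https://api.twilio.com/2010-04-01/"
        f"Accounts/{account_sid}/Calls/{call_sid}/Recordings.json"
    )
    # Dual-channel keeps the caller and the bot on separate audio channels.
    body = urlencode({"RecordingChannels": "dual"})
    return url, body

url, body = recording_request("ACxxxxxxxx", "CAxxxxxxxx")
print(url)
print(body)
```

<p>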
Recordings are stored in your Twilio account and accessible through the <a href="https://console.twilio.com/">Twilio Console</a> or API - no application-level storage needed.</p><p>This keeps things simple: no recording files to manage, no volumes to mount, and Twilio handles the audio encoding and storage for you.</p><h2><strong>Running Locally with Docker</strong></h2><p><a href="https://modal.com/">Modal</a> is great for quick deployments, but sometimes you want to run things on your own machine - maybe for faster iteration, or to see the logs in real time, or just because you want to understand what&#8217;s happening without a cloud layer in between.</p><p>The outbound bot supports this through <a href="https://docs.docker.com/get-docker/">Docker</a>.</p><p>There&#8217;s one catch, though: Twilio needs to reach your server from the internet to fetch TwiML and connect the WebSocket. Your laptop is usually behind a firewall (meaning the outside world can&#8217;t reach it directly), so Twilio can&#8217;t talk to it.</p><p>The solution is a <strong>tunnel</strong> - a tool that gives your local machine a temporary public URL. VS Code has this built in (they call it &#8220;<a href="https://code.visualstudio.com/docs/editor/port-forwarding">dev tunnels</a>&#8220;), and it works well:</p><ol><li><p>In VS Code, open the <strong>Ports</strong> panel</p></li><li><p>Forward port <strong>7860</strong> and set it to <strong>Public</strong> (Twilio can&#8217;t use authenticated tunnels)</p></li><li><p>Copy the forwarded address - it&#8217;ll look like </p></li></ol><p>https://something-7860-region.devtunnels.ms</p><p>Now run the bot:</p><pre><code>cd outbound
export PROXY_HOST=something-7860-region.devtunnels.ms
docker compose up --build</code></pre><p>The <code>PROXY_HOST</code> variable is important - it tells the bot what public URL to put in the <a href="https://www.twilio.com/docs/voice/twiml">TwiML</a> response so Twilio knows where to open the WebSocket. Without it, Twilio would try to connect to <code>localhost</code>, which obviously won&#8217;t work.</p><p>Once the container is running, you can trigger calls the same way - just point the curl command at your tunnel URL instead of the Modal URL.</p><p>Recordings in Docker mode get saved to a <code>./recordings/</code> folder on your machine (mounted as a Docker volume), so you can listen to them right away. See the <a href="https://github.com/jaeyow/twilio-deepgram-voice/blob/main/blog-outbound.md">repo&#8217;s README</a> for the full Docker setup details.</p><h2><strong>Wrapping Up</strong></h2><p>Over these two articles, we&#8217;ve built two complete voice AI applications:</p><ol><li><p><strong>Inbound</strong>: An AI that answers phone calls and has conversations with callers</p></li><li><p><strong>Outbound</strong>: An AI that dials phone numbers and starts conversations with people</p></li></ol><p>The core is the same - a <a href="https://docs.pipecat.ai/">Pipecat</a> pipeline that connects speech recognition, an LLM, and text-to-speech into a real-time conversation engine. What changes between inbound and outbound is just the setup: who starts the call and how the connection gets wired up.</p><p>The beauty of this pipeline approach is that every piece is swappable. Want a different LLM? Swap out Groq for OpenAI or Anthropic. Want a different voice? Change the <a href="https://developers.deepgram.com/docs/tts-models">Deepgram TTS model</a>. Want the bot to behave differently? Edit the system prompt. 
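</p><p>The swappability in miniature - stub functions standing in for Pipecat&#8217;s service classes, just to show the assembly-line shape (none of this is the repo&#8217;s actual code):</p>

```python
# Stub stages standing in for Pipecat services (not the real classes):
# swap any stage for another implementation without touching the rest.
def stt(audio: bytes) -> str:
    return "hello there"            # e.g. Deepgram nova-3

def llm(text: str) -> str:
    return f"You said: {text}"      # e.g. Groq Llama 3.3, or another LLM

def tts(text: str) -> bytes:
    return text.encode("utf-8")     # e.g. a different Deepgram voice

PIPELINE = [stt, llm, tts]

def run(pipeline, payload):
    # Each stage consumes the previous stage's output, assembly-line style.
    for stage in pipeline:
        payload = stage(payload)
    return payload

out = run(PIPELINE, b"\x00\x01 raw mu-law audio")
print(out)
```

<p>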
The pipeline structure stays the same.</p><p>A few ideas for where you could take this next:</p><ul><li><p><strong>Connect to a CRM</strong> - Look up the caller&#8217;s info and personalise the conversation</p></li><li><p><strong>Appointment reminders</strong> - Schedule outbound calls with specific messages</p></li><li><p><strong>Surveys and feedback</strong> - Have the bot ask questions and record the answers</p></li><li><p><strong>Multi-language support</strong> - Swap in STT and TTS models for different languages</p></li></ul><p>The full code for both bots is in the <a href="https://github.com/jaeyow/twilio-deepgram-voice/blob/main/blog-outbound.md">GitHub repo</a>. If you build something with it, I&#8217;d love to hear about it.</p>]]></content:encoded></item><item><title><![CDATA[#13 - Voice AI Over the Phone: A Practical Guide with Pipecat, Twilio, Deepgram and Modal]]></title><description><![CDATA[Building a full-featured inbound Voice AI application with a $0 budget]]></description><link>https://www.fullstackml.dev/p/voice-ai-over-the-phone-a-practical</link><guid isPermaLink="false">https://www.fullstackml.dev/p/voice-ai-over-the-phone-a-practical</guid><dc:creator><![CDATA[JO Reyes]]></dc:creator><pubDate>Sun, 08 Feb 2026 10:43:54 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!HKBu!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F56cc72b1-525c-4c67-a6ab-5dda188e97f2_1696x608.heic" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>A few weeks ago, I was scrolling through my feed and kept seeing demos of voice AI assistants. People were having full conversations with AI over the phone - not the clunky &#8220;press 1 for billing&#8221; kind, but actual back-and-forth dialogue. It sounded natural. 
It sounded <em>fast</em>.</p><p>My first thought was: &#8220;That&#8217;s cool, but it&#8217;s probably built by a team of 20 engineers at some well-funded startup.&#8221;</p><p>My second thought was: &#8220;...but what if it&#8217;s not?&#8221;</p><p>So I went down the rabbit hole. I&#8217;ve been a full-stack developer for years - I&#8217;m comfortable with APIs, backends, cloud infrastructure, all of that. But telephony and audio processing? That was completely new territory for me. I wanted to see if I could build a voice AI that answers real phone calls. Not a toy demo. A real phone number you can call, have a conversation with, and hang up.</p><p>Turns out, the tools have gotten really good. After a weekend of tinkering, I had a working voice bot named Miss Harper - an elementary school teacher who answers phone calls and chats with you in a friendly Australian accent. You call the number, she picks up, says hello, and you can talk to her like a normal person. She listens, thinks, and talks back - all in real time.</p><p>Don&#8217;t take my word for it - here&#8217;s a recording of an actual call with Miss Harper:</p><div class="native-audio-embed" data-component-name="AudioPlaceholder" data-attrs="{&quot;label&quot;:null,&quot;mediaUploadId&quot;:&quot;ced9e093-e692-4051-b4b9-7d50f8a80e9c&quot;,&quot;duration&quot;:194.3249,&quot;downloadable&quot;:true,&quot;isEditorNode&quot;:true}"></div><p>Hear Miss Harper tell me a story about a little seed named Sammy and how he became a great big tree! Miss Harper can even do basic mathematics!<br><br>In this article, I&#8217;ll walk you through exactly how I built it, step by step. By the end, you&#8217;ll have your own AI-powered phone number that picks up calls and has conversations. No prior experience with voice, audio, or telephony needed.</p><p>Let&#8217;s get into it.</p><h2><strong>What We&#8217;re Building</strong></h2><p>Here&#8217;s the end result: you&#8217;ll have a phone number. 
When someone calls it, an AI bot picks up, introduces itself, and has a natural voice conversation with the caller. </p><p>Under the hood, the flow looks like this:</p><pre><code><code>You call a phone number
       &#8595;
   Twilio (handles the phone call)
       &#8595;
   WebSocket (streams audio back and forth)
       &#8595;
   Pipecat (processes the audio through AI services)
       &#8595;
   You hear the bot respond</code></code></pre><p>The key idea is simple: it&#8217;s a <strong>pipeline</strong>. Audio goes in one end, gets turned into text, the AI generates a reply, that reply gets turned back into speech, and the speech goes back out. Everything happens in real time - under a second from when you stop talking to when you hear the response.</p><p>Here&#8217;s what the Pipecat pipeline looks like inside - each module feeds into the next, like an assembly line:<br></p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!6hDd!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8d935c8-3dc5-456a-87b9-e9c296327670_5238x410.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!6hDd!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8d935c8-3dc5-456a-87b9-e9c296327670_5238x410.png 424w, https://substackcdn.com/image/fetch/$s_!6hDd!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8d935c8-3dc5-456a-87b9-e9c296327670_5238x410.png 848w, https://substackcdn.com/image/fetch/$s_!6hDd!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8d935c8-3dc5-456a-87b9-e9c296327670_5238x410.png 1272w, https://substackcdn.com/image/fetch/$s_!6hDd!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8d935c8-3dc5-456a-87b9-e9c296327670_5238x410.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!6hDd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8d935c8-3dc5-456a-87b9-e9c296327670_5238x410.png" width="1456" height="114" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e8d935c8-3dc5-456a-87b9-e9c296327670_5238x410.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:114,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:175001,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://fullstackml.dev/i/187161946?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8d935c8-3dc5-456a-87b9-e9c296327670_5238x410.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!6hDd!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8d935c8-3dc5-456a-87b9-e9c296327670_5238x410.png 424w, https://substackcdn.com/image/fetch/$s_!6hDd!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8d935c8-3dc5-456a-87b9-e9c296327670_5238x410.png 848w, https://substackcdn.com/image/fetch/$s_!6hDd!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8d935c8-3dc5-456a-87b9-e9c296327670_5238x410.png 1272w, https://substackcdn.com/image/fetch/$s_!6hDd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8d935c8-3dc5-456a-87b9-e9c296327670_5238x410.png 1456w" sizes="100vw" 
loading="lazy"></picture><div></div></div></a></figure></div><p>This cascading design is what makes Pipecat powerful. Each piece does one job and passes its output to the next. You can swap any piece out - use a different LLM, a different voice, a different speech recognition model - without touching the rest of the pipeline.</p><h2><strong>The Building Blocks</strong></h2><p>Before we start building, let me introduce the tools we&#8217;ll use and explain why each one matters.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!HKBu!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F56cc72b1-525c-4c67-a6ab-5dda188e97f2_1696x608.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!HKBu!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F56cc72b1-525c-4c67-a6ab-5dda188e97f2_1696x608.heic 424w, https://substackcdn.com/image/fetch/$s_!HKBu!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F56cc72b1-525c-4c67-a6ab-5dda188e97f2_1696x608.heic 848w, https://substackcdn.com/image/fetch/$s_!HKBu!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F56cc72b1-525c-4c67-a6ab-5dda188e97f2_1696x608.heic 1272w, https://substackcdn.com/image/fetch/$s_!HKBu!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F56cc72b1-525c-4c67-a6ab-5dda188e97f2_1696x608.heic 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!HKBu!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F56cc72b1-525c-4c67-a6ab-5dda188e97f2_1696x608.heic" width="1456" height="522" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/56cc72b1-525c-4c67-a6ab-5dda188e97f2_1696x608.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:522,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:215437,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://fullstackml.dev/i/187161946?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F56cc72b1-525c-4c67-a6ab-5dda188e97f2_1696x608.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!HKBu!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F56cc72b1-525c-4c67-a6ab-5dda188e97f2_1696x608.heic 424w, https://substackcdn.com/image/fetch/$s_!HKBu!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F56cc72b1-525c-4c67-a6ab-5dda188e97f2_1696x608.heic 848w, https://substackcdn.com/image/fetch/$s_!HKBu!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F56cc72b1-525c-4c67-a6ab-5dda188e97f2_1696x608.heic 1272w, https://substackcdn.com/image/fetch/$s_!HKBu!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F56cc72b1-525c-4c67-a6ab-5dda188e97f2_1696x608.heic 1456w" 
sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p><strong><a href="https://www.twilio.com/">Twilio</a></strong> is our phone company. It gives us a real phone number and handles all the complicated stuff about connecting phone calls to the internet. When someone calls your Twilio number, Twilio asks your server what to do with the call. Think of Twilio as the bridge between the old-school phone network and our modern AI code.</p><p><strong><a href="https://docs.pipecat.ai/">Pipecat</a></strong> is the glue that holds everything together. 
It&#8217;s an open-source framework that lets you build real-time audio AI applications by connecting different services into a pipeline. Instead of writing all the plumbing code yourself - handling audio streams, managing conversation turns, coordinating between services - Pipecat handles all of that. You just tell it what services to use and in what order.</p><p><strong><a href="https://groq.com/">Groq</a></strong> is the brain. It runs the large language model (LLM) - basically the same kind of AI that powers ChatGPT, but running on Groq&#8217;s super-fast hardware. We use a model called <a href="https://www.llama.com/">Llama 3.3 70B</a>. The speed matters a lot here: in a phone conversation, even a half-second delay feels awkward. Groq is one of the fastest options available.</p><p><strong><a href="https://deepgram.com/">Deepgram</a></strong> pulls double duty - it&#8217;s both the ears and the mouth. Deepgram&#8217;s speech-to-text (STT) service listens to what the caller says and converts it to text so the AI can understand it. Then Deepgram&#8217;s text-to-speech (TTS) service takes the AI&#8217;s text response and converts it back to natural-sounding speech. We use their &#8220;Theia&#8221; voice, which has a warm Australian accent.</p><p><strong><a href="https://github.com/snakers4/silero-vad">Silero</a></strong> is the social awareness layer. It handles something called voice activity detection (VAD), which is just a fancy way of saying &#8220;it knows when you&#8217;re talking and when you&#8217;ve stopped.&#8221; On top of that, we use something called Smart Turn v3, which is smarter than just listening for silence. It uses machine learning to figure out when you&#8217;ve actually finished your thought - not just paused to take a breath. This is what makes the conversation feel natural instead of robotic.</p><p><strong><a href="https://modal.com/">Modal</a></strong> is where our code runs. 
It&#8217;s a cloud platform that lets you deploy Python apps without setting up servers. You write your code, run one command, and it&#8217;s live on the internet with a public URL. It also auto-scales - if nobody&#8217;s calling, it costs nothing. If ten people call at once, it spins up more workers.</p><h2><strong>How It Actually Works</strong></h2><p>Let me walk you through what happens when someone calls your bot, step by step.</p><p><strong>Step 1: The phone rings.</strong> Someone dials your Twilio phone number. This is a normal phone call - they could be calling from a mobile phone, a landline, whatever.</p><p><strong>Step 2: Twilio asks your server what to do.</strong> The moment the call connects, Twilio sends a message to your server (an HTTP POST request) saying, &#8220;Hey, someone&#8217;s calling. What should I do?&#8221; Your server responds with a small piece of XML called <a href="https://www.twilio.com/docs/voice/twiml">TwiML</a> (Twilio Markup Language). Think of TwiML as a short instruction sheet for Twilio - in our case it just says: &#8220;Open a WebSocket connection back to me and start streaming the audio.&#8221;</p><p>A WebSocket is just a persistent two-way connection - think of it like opening a phone line between Twilio and your server where data (in this case, audio) can flow in both directions at the same time.</p><p><strong>Step 3: Audio starts flowing.</strong> Once the WebSocket is open, Twilio starts streaming the caller&#8217;s audio to your server in real time via <a href="https://www.twilio.com/docs/voice/media-streams">Media Streams</a> - small chunks of sound, one after another. Your server can also send audio back through the same connection, and Twilio plays it to the caller.</p><p><strong>Step 4: The Pipecat pipeline kicks in.</strong> This is where the magic happens. 
The audio goes through a series of steps, one after another:</p><ol><li><p><strong>Noise filter</strong> - An <a href="https://jmvalin.ca/demo/rnnoise/">RNNoise</a> filter cleans up background noise from the caller&#8217;s audio. This makes the speech recognition more accurate.</p></li><li><p><strong>Voice activity detection</strong> - Silero detects that someone is actually talking (as opposed to silence or background noise).</p></li><li><p><strong>Speech-to-text</strong> - Deepgram&#8217;s <a href="https://deepgram.com/learn/nova-3-speech-to-text-api">nova-3</a> model converts the spoken words into text in real time.</p></li><li><p><strong>AI thinking</strong> - The text goes to Groq&#8217;s Llama 3.3 model, which generates a response based on the conversation history and the bot&#8217;s personality (defined by a system prompt - more on that later).</p></li><li><p><strong>Text-to-speech</strong> - Deepgram&#8217;s Theia voice converts the AI&#8217;s text response into spoken audio.</p></li><li><p><strong>Audio out</strong> - The audio streams back through the WebSocket to Twilio, and Twilio plays it to the caller.</p></li></ol><p>All of this happens fast - typically under a second from when you finish talking to when you hear the response. One reason it feels so quick: the LLM doesn&#8217;t wait until it&#8217;s finished writing the whole reply before handing it off. It sends words to the text-to-speech service as it generates them, so the caller starts hearing the response while the AI is still thinking of the rest. 
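</p><p>A toy version of that hand-off: a fake LLM yields words as they are generated, and each one is passed along immediately instead of waiting for the full reply (illustration only, not the repo&#8217;s code):</p>

```python
import asyncio

async def fake_llm():
    # Yield one word at a time, like a streaming LLM response.
    for word in "Thanks for calling how can I help".split():
        await asyncio.sleep(0.01)   # pretend per-token generation time
        yield word

async def stream_to_tts():
    chunks = []
    async for word in fake_llm():
        # Real pipeline: each chunk goes straight to text-to-speech,
        # so playback starts before generation finishes.
        chunks.append(word)
    return chunks

chunks = asyncio.run(stream_to_tts())
print(chunks)
```

<p>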
It works like an assembly line, not a batch job.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!SUOp!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F852e0818-08de-4b51-b3e1-32068faedfe3_532x76.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!SUOp!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F852e0818-08de-4b51-b3e1-32068faedfe3_532x76.heic 424w, https://substackcdn.com/image/fetch/$s_!SUOp!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F852e0818-08de-4b51-b3e1-32068faedfe3_532x76.heic 848w, https://substackcdn.com/image/fetch/$s_!SUOp!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F852e0818-08de-4b51-b3e1-32068faedfe3_532x76.heic 1272w, https://substackcdn.com/image/fetch/$s_!SUOp!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F852e0818-08de-4b51-b3e1-32068faedfe3_532x76.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!SUOp!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F852e0818-08de-4b51-b3e1-32068faedfe3_532x76.heic" width="532" height="76" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/852e0818-08de-4b51-b3e1-32068faedfe3_532x76.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:76,&quot;width&quot;:532,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:12362,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://fullstackml.dev/i/187161946?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F852e0818-08de-4b51-b3e1-32068faedfe3_532x76.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!SUOp!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F852e0818-08de-4b51-b3e1-32068faedfe3_532x76.heic 424w, https://substackcdn.com/image/fetch/$s_!SUOp!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F852e0818-08de-4b51-b3e1-32068faedfe3_532x76.heic 848w, https://substackcdn.com/image/fetch/$s_!SUOp!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F852e0818-08de-4b51-b3e1-32068faedfe3_532x76.heic 1272w, https://substackcdn.com/image/fetch/$s_!SUOp!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F852e0818-08de-4b51-b3e1-32068faedfe3_532x76.heic 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a><figcaption class="image-caption">Deepgram speech to text (STT) latency at around 3.54 ms</figcaption></figure></div><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" 
href="https://substackcdn.com/image/fetch/$s_!qk9T!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30ac78b2-59ef-44b5-a024-a6ff2ad5b86c_683x118.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!qk9T!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30ac78b2-59ef-44b5-a024-a6ff2ad5b86c_683x118.heic 424w, https://substackcdn.com/image/fetch/$s_!qk9T!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30ac78b2-59ef-44b5-a024-a6ff2ad5b86c_683x118.heic 848w, https://substackcdn.com/image/fetch/$s_!qk9T!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30ac78b2-59ef-44b5-a024-a6ff2ad5b86c_683x118.heic 1272w, https://substackcdn.com/image/fetch/$s_!qk9T!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30ac78b2-59ef-44b5-a024-a6ff2ad5b86c_683x118.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!qk9T!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30ac78b2-59ef-44b5-a024-a6ff2ad5b86c_683x118.heic" width="683" height="118" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/30ac78b2-59ef-44b5-a024-a6ff2ad5b86c_683x118.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:118,&quot;width&quot;:683,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:26083,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://fullstackml.dev/i/187161946?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30ac78b2-59ef-44b5-a024-a6ff2ad5b86c_683x118.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!qk9T!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30ac78b2-59ef-44b5-a024-a6ff2ad5b86c_683x118.heic 424w, https://substackcdn.com/image/fetch/$s_!qk9T!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30ac78b2-59ef-44b5-a024-a6ff2ad5b86c_683x118.heic 848w, https://substackcdn.com/image/fetch/$s_!qk9T!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30ac78b2-59ef-44b5-a024-a6ff2ad5b86c_683x118.heic 1272w, https://substackcdn.com/image/fetch/$s_!qk9T!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30ac78b2-59ef-44b5-a024-a6ff2ad5b86c_683x118.heic 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a><figcaption class="image-caption">Groq LLM latency at 659.1 ms and Deepgram text to speech (TTS) at 2.04 ms</figcaption></figure></div><p><strong>Voice-to-voice latency</strong> is the time from when the 
<strong>user finishes speaking</strong> to when they <strong>hear the bot&#8217;s first word of response</strong>.</p><p>It encompasses the entire chain:</p><ol><li><p><strong>Audio travels</strong> from the phone to Twilio to your server</p></li><li><p><strong>STT</strong> converts the speech to text</p></li><li><p><strong>Turn detection</strong> confirms the user is done speaking</p></li><li><p><strong>LLM</strong> generates the first tokens of a response</p></li><li><p><strong>TTS</strong> converts those tokens to audio</p></li><li><p><strong>Audio travels</strong> back through Twilio to the caller&#8217;s phone</p></li></ol><p>So in the above example, the voice-to-voice latency for the inbound call application we are building is in the vicinity of 600 to 1000 ms. Not bad at all, considering we are running everything on CPU and just using free-tier API models from <strong>Deepgram</strong> and <strong>Groq</strong>!</p><p><strong>Step 5: Turn detection makes it natural.</strong> Here&#8217;s a subtle but important piece: how does the bot know when you&#8217;ve finished talking? Simple silence detection would be too aggressive - it would cut you off every time you pause to think. Smart Turn v3 uses machine learning to detect natural endpoints in speech. It understands the difference between a pause in the middle of a sentence and actually being done talking.</p><p><strong>Step 6: The conversation continues.</strong> The bot keeps the full conversation history in memory for the duration of the call, so it remembers what you said earlier. (Each call starts fresh - it doesn&#8217;t remember previous calls.) If you go quiet for too long (5 seconds), the bot will take the initiative and continue the conversation on its own.</p><p><strong>Step 7: Recording.</strong> The bot kicks off a <a href="https://www.twilio.com/docs/voice/tutorials/how-to-record-phone-calls">Twilio-side recording</a> via the REST API as soon as the call connects. 
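For reference, starting that recording amounts to a single authenticated POST to Twilio's Recordings endpoint. Here is a stdlib-only sketch that builds (but does not send) the request - the SIDs are placeholders, and the repo's actual code may use the Twilio helper library instead:

```python
# Hedged sketch: build (but don't send) the Twilio REST request that starts
# a dual-channel recording on a live call. The SIDs below are placeholders.
import urllib.parse

def recording_request(account_sid: str, call_sid: str) -> tuple[str, bytes]:
    url = (
        f"https://api.twilio.com/2010-04-01/Accounts/{account_sid}"
        f"/Calls/{call_sid}/Recordings.json"
    )
    # Dual-channel keeps the caller and the bot on separate audio channels.
    body = urllib.parse.urlencode({"RecordingChannels": "dual"}).encode()
    return url, body

url, body = recording_request("ACxxxxxxxx", "CAxxxxxxxx")
print(url)
print(body.decode())  # RecordingChannels=dual
```

The real request also needs HTTP Basic auth with your Account SID and auth token.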
Twilio records in dual-channel (one channel per participant), so you get clean, separated audio without any memory overhead in your application. Recordings are stored in your Twilio account and accessible from the <a href="https://console.twilio.com/">Twilio Console</a> or via the API.</p><h2><strong>Setting Up Your Own</strong></h2><p>Ready to build it? Here&#8217;s what you need.</p><h3><strong>Accounts to Create (All Free Tiers Available)</strong></h3><p>You&#8217;ll need accounts with four services. All of them have free tiers, and a handful of test calls won&#8217;t come close to using up the free credits.</p><ol><li><p><strong>Twilio</strong> - Sign up at <a href="https://www.twilio.com/">twilio.com</a>. You&#8217;ll need a phone number that supports voice calls. Twilio&#8217;s trial account gives you one for free.</p></li><li><p><strong>Groq</strong> - Get an API key at <a href="https://console.groq.com/">console.groq.com</a>. The free tier is generous enough for testing.</p></li><li><p><strong>Deepgram</strong> - Get an API key at <a href="https://deepgram.com/">deepgram.com</a>. Free credits are included when you sign up.</p></li><li><p><strong>Modal</strong> - Sign up at <a href="https://modal.com/">modal.com</a>. The free tier gives you enough compute for plenty of test calls.</p></li></ol><h3><strong>Deploy the Bot</strong></h3><p>First, install the Modal CLI and set it up:</p><pre><code><code>pip install modal
modal setup</code></code></pre><p>Clone the repo and set up your keys:</p><pre><code><code>git clone https://github.com/jaeyow/twilio-chatbot.git
cd twilio-chatbot/inbound
cp env.example .env</code></code></pre><p>Open the <code>.env</code> file and fill in your API keys:</p><pre><code><code>GROQ_API_KEY=your_groq_api_key
DEEPGRAM_API_KEY=your_deepgram_api_key
TWILIO_ACCOUNT_SID=your_twilio_account_sid
TWILIO_AUTH_TOKEN=your_twilio_auth_token</code></code></pre><p>Now deploy to Modal:</p><pre><code><code>modal serve modal_app.py</code></code></pre><p>This gives you a public URL that looks something like:<br></p><p>https://your-workspace--twilio-inbound-bot-serve.modal.run</p><h3><strong>Point Twilio to Your Bot</strong></h3><p>The last step is telling Twilio to send incoming calls to your bot:</p><ol><li><p>Go to your <a href="https://console.twilio.com/">Twilio Console</a></p></li><li><p>Navigate to <strong>Phone Numbers &#8594; Manage &#8594; Active numbers</strong></p></li><li><p>Click on your phone number</p></li><li><p>Under <strong>Voice Configuration</strong>, set &#8220;A call comes in&#8221; to:</p><ul><li><p><strong>Webhook</strong></p></li><li><p><strong>URL</strong>: paste your Modal URL</p></li><li><p><strong>Method</strong>: POST</p></li></ul></li><li><p>Click <strong>Save configuration</strong></p></li></ol><p>That&#8217;s it. Call your Twilio number. It might take a second or two to connect the first time (Modal needs to spin up a worker), but then Miss Harper will pick up and greet you. Say something back - you&#8217;re having a conversation with your own voice AI.</p><h3><strong>Running Locally with Docker (Alternative)</strong></h3><p>If you&#8217;d rather run the bot on your own machine instead of Modal, you can use <a href="https://docs.docker.com/compose/">Docker Compose</a>. The main thing you&#8217;ll need is a way for Twilio to reach your laptop from the internet, since Twilio needs a public URL to send audio to. VS Code has built-in <a href="https://code.visualstudio.com/docs/editor/port-forwarding">dev tunnels</a> that work well for this.</p><p>The steps are: forward port 7860 in VS Code&#8217;s Ports panel (set it to Public), grab the tunnel URL, then run:</p><pre><code><code>cd inbound
export PROXY_HOST=your-tunnel-hostname
docker compose up --build</code></code></pre><p>Point your Twilio webhook to the tunnel URL instead of the Modal URL, and everything works the same way. See the <a href="https://github.com/jaeyow/twilio-deepgram-voice/tree/main/inbound">repo&#8217;s README</a> for the full Docker setup details.</p><h2><strong>Giving Your Bot a Personality</strong></h2><p>The bot&#8217;s personality comes from one thing: the <strong>system prompt</strong>. This is the instruction you give the AI that tells it who it is and how to behave. Here&#8217;s the one we use:</p><blockquote><p>&#8220;You are Miss Harper, an elementary school teacher in an audio call. Your output will be converted to audio so don&#8217;t include special characters in your answers. Respond to what the student said in a short sentence.&#8221;</p></blockquote><p>You can change this to anything you want. Make it a customer service agent, a language tutor, a trivia host - whatever fits your use case. Just remember to tell it that responses will be read aloud, so it should avoid things like bullet points, URLs, or special characters.</p><p>You can also change the <strong>voice</strong>. We use Deepgram&#8217;s &#8220;aura-2-theia-en&#8221; voice (a warm Australian accent), but they have <a href="https://developers.deepgram.com/docs/tts-models">other voices</a> to choose from.</p><p>One more thing worth mentioning: if the caller goes quiet for more than 5 seconds, the bot doesn&#8217;t just sit there in awkward silence. It takes the lead and continues the conversation. In our case, Miss Harper keeps teaching. You can adjust this timeout or change what the bot says when it kicks in.</p><h2><strong>What&#8217;s Next</strong></h2><p>So there you have it - a voice AI that answers real phone calls, built with open-source tools and free-tier cloud services. 
The whole thing is about 270 lines of Python spread across two files.</p><p>But here&#8217;s what kept nagging me after I got this working: the bot only talks when <em>I</em> call <em>it</em>. What if I want the bot to pick up the phone and call <em>someone else</em>?</p><p>That&#8217;s exactly what I built next - an outbound voice bot that dials people and starts conversations. I&#8217;ll walk through that in the next article.</p><p>If you want to see the full code, it&#8217;s all in the <a href="https://github.com/jaeyow/twilio-deepgram-voice/tree/main/inbound">GitHub repo</a>. Questions or ideas? I&#8217;d love to hear from you.</p>]]></content:encoded></item><item><title><![CDATA[#12 - A Developer’s Guide to Failure Analysis and Evaluation for Single-turn and Multi-turn Chatbots - Part 2]]></title><description><![CDATA[Stop Guessing and Start Auditing: How qualitative analysis turns "vibe checks" into engineering.]]></description><link>https://www.fullstackml.dev/p/12-a-developers-guide-to-failure</link><guid isPermaLink="false">https://www.fullstackml.dev/p/12-a-developers-guide-to-failure</guid><dc:creator><![CDATA[JO Reyes]]></dc:creator><pubDate>Sun, 11 Jan 2026 08:51:29 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!yCAV!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F03c5a8e0-ee68-48b9-88c4-5e70f6290e9b_1024x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><a href="https://fullstackml.dev/p/11-a-developers-guide-to-failure">In Part 1,</a> we saw how <a href="https://github.com/jaeyow/multi_turn_chatbot_eval">JO&#8217;s Bike Shop Chatbot</a> could sound perfectly professional while quietly leading a customer to show up at a closed shop for a service we don&#8217;t even offer. If you&#8217;re building LLM apps, that &#8220;plausible-but-wrong&#8221; drift is your biggest enemy.</p><p>But how do we stop guessing and start fixing? 
It all starts with <strong>Traces</strong>.</p><p>If an LLM application is an aeroplane, a&nbsp;<strong>Trace</strong>&nbsp;is the Flight Data Recorder (the &#8220;Black Box&#8221;). When the plane lands safely or crashes, you don&#8217;t just look at the wreckage; you look at the logs to see exactly what the sensors were reporting at every altitude.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!yCAV!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F03c5a8e0-ee68-48b9-88c4-5e70f6290e9b_1024x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!yCAV!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F03c5a8e0-ee68-48b9-88c4-5e70f6290e9b_1024x1024.png 424w, https://substackcdn.com/image/fetch/$s_!yCAV!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F03c5a8e0-ee68-48b9-88c4-5e70f6290e9b_1024x1024.png 848w, https://substackcdn.com/image/fetch/$s_!yCAV!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F03c5a8e0-ee68-48b9-88c4-5e70f6290e9b_1024x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!yCAV!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F03c5a8e0-ee68-48b9-88c4-5e70f6290e9b_1024x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!yCAV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F03c5a8e0-ee68-48b9-88c4-5e70f6290e9b_1024x1024.png" width="1024" height="1024" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/03c5a8e0-ee68-48b9-88c4-5e70f6290e9b_1024x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1024,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1275441,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://fullstackml.dev/i/182654507?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F03c5a8e0-ee68-48b9-88c4-5e70f6290e9b_1024x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!yCAV!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F03c5a8e0-ee68-48b9-88c4-5e70f6290e9b_1024x1024.png 424w, https://substackcdn.com/image/fetch/$s_!yCAV!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F03c5a8e0-ee68-48b9-88c4-5e70f6290e9b_1024x1024.png 848w, https://substackcdn.com/image/fetch/$s_!yCAV!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F03c5a8e0-ee68-48b9-88c4-5e70f6290e9b_1024x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!yCAV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F03c5a8e0-ee68-48b9-88c4-5e70f6290e9b_1024x1024.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><h2>What are we building again?</h2><p>To keep things simple, I&#8217;ve been using a <a href="https://github.com/jaeyow/multi_turn_chatbot_eval/blob/main/streamlit_app.py">Streamlit-based interface</a> for JO&#8217;s Bike Shop. 
It looks like a standard chat window, but behind the scenes, it&#8217;s powered by a state machine that tracks whether the user is just &#8220;browsing&#8221; or trying to &#8220;book&#8221; a repair.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!KCkZ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc46cd18-9180-4065-9868-9fb54a9561d9_1588x1658.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!KCkZ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc46cd18-9180-4065-9868-9fb54a9561d9_1588x1658.png 424w, https://substackcdn.com/image/fetch/$s_!KCkZ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc46cd18-9180-4065-9868-9fb54a9561d9_1588x1658.png 848w, https://substackcdn.com/image/fetch/$s_!KCkZ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc46cd18-9180-4065-9868-9fb54a9561d9_1588x1658.png 1272w, https://substackcdn.com/image/fetch/$s_!KCkZ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc46cd18-9180-4065-9868-9fb54a9561d9_1588x1658.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!KCkZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc46cd18-9180-4065-9868-9fb54a9561d9_1588x1658.png" width="1456" height="1520" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cc46cd18-9180-4065-9868-9fb54a9561d9_1588x1658.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1520,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:282639,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://fullstackml.dev/i/182654507?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc46cd18-9180-4065-9868-9fb54a9561d9_1588x1658.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!KCkZ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc46cd18-9180-4065-9868-9fb54a9561d9_1588x1658.png 424w, https://substackcdn.com/image/fetch/$s_!KCkZ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc46cd18-9180-4065-9868-9fb54a9561d9_1588x1658.png 848w, https://substackcdn.com/image/fetch/$s_!KCkZ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc46cd18-9180-4065-9868-9fb54a9561d9_1588x1658.png 1272w, https://substackcdn.com/image/fetch/$s_!KCkZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc46cd18-9180-4065-9868-9fb54a9561d9_1588x1658.png 1456w" sizes="100vw"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 
0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2><strong>Step One: Trace Collection</strong></h2><p>You cannot evaluate what you haven&#8217;t recorded. Before you can build fancy &#8220;AI Evaluators&#8221; or automated tests, you need a library of real (or high-quality synthetic) conversations. Going synthetic is especially critical when you have a brand-new application and need &#8220;real-world&#8221; traces but don&#8217;t yet have real users to collect them from. </p><p>I use <strong><a href="https://burr.apache.org/">Apache Burr</a></strong> for this. Burr is a state-management library that essentially takes a &#8220;snapshot&#8221; of the chatbot&#8217;s brain at every turn. 
It records:</p><ul><li><p><strong>The Input:</strong> What the user said.</p></li><li><p><strong>The State:</strong> What the chatbot &#8220;thought&#8221; was happening (e.g., &#8220;User is asking about Saturday hours&#8221;).</p></li><li><p><strong>The Result:</strong> The actual text sent back.</p></li></ul><p>While I&#8217;m using Burr because it fits my workflow, the tool itself matters less than the <strong>habit</strong>. Whether you use <a href="https://www.langchain.com/langgraph">LangGraph</a> or, say, <a href="https://ai.pydantic.dev/">Pydantic AI</a>, these libraries can save traces just like Burr, though you&#8217;ll have to confirm for yourself whether they can send those traces to a local OTel collector or even to local files. Or just save your own JSON logs to a database; <strong>tracing is the foundation of all evaluation work.</strong></p><h3>What does a Trace actually look like?</h3><p>It&#8217;s not just a chat log. It&#8217;s a data object. Here is a simplified snapshot of a trace from a conversation turn of our chatbot:</p><pre><code><code>{
  "turn_number": 3,
  "messages": [
    {
      "role": "user",
      "content": "Do I need to book ahead or can I just walk in?"
    },
    {
      "role": "assistant",
      "content": "Hey there! At JO's Bike Shop, it's best to book ahead for our services to ensure we can give your bike the attention it deserves. Give us a buzz at (503) 555-BIKE or drop by and chat with our team to schedule your tune-up or maintenance. Can't wait to get you back on the road! \ud83d\udeb2"
    }
  ]
},
{
  "turn_number": 4,
  "messages": [
    {
      "role": "user",
      "content": "Cool. I\u2019d like to drop my bike in tomorrow morning 10am then. Is that usually a quiet time?"
    },
    {
      "role": "assistant",
      "content": "Perfect! Let me confirm your appointment:\n\n\u2022 Service: Brake and Gear Cable Replacement\n\u2022 Date: tomorrow\n\u2022 Time: 10am\n\u2022 Issues: squeaky brakes\n\nDoes this look correct? I can book this for you now!"
    }
  ]
}</code></code></pre><p>Looking at this JSON, Turn 3 is exactly where the conversation could have slipped into hallucination, since the shop requires advance booking for brake and cable tune-ups - the very service this customer needs. There are more trace details in the&nbsp;<a href="http://github.com/jaeyow/multi_turn_chatbot_eval/blob/main/multi_turn_chatbot_eval_conversations.json">actual JSON file here,</a>&nbsp;including the state of the chat application at each turn.</p><p>Without this trace, it would be hard to tell, when running your evaluations, whether your bot is hallucinating or grounding its responses in facts. This is the foundation for building evaluations for our chatbot. </p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!C6o1!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fccc5cef2-2a14-4bb6-a33e-cdbf0fff39e7_1761x933.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!C6o1!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fccc5cef2-2a14-4bb6-a33e-cdbf0fff39e7_1761x933.png 424w, https://substackcdn.com/image/fetch/$s_!C6o1!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fccc5cef2-2a14-4bb6-a33e-cdbf0fff39e7_1761x933.png 848w, https://substackcdn.com/image/fetch/$s_!C6o1!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fccc5cef2-2a14-4bb6-a33e-cdbf0fff39e7_1761x933.png 1272w, 
https://substackcdn.com/image/fetch/$s_!C6o1!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fccc5cef2-2a14-4bb6-a33e-cdbf0fff39e7_1761x933.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!C6o1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fccc5cef2-2a14-4bb6-a33e-cdbf0fff39e7_1761x933.png" width="1456" height="771" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ccc5cef2-2a14-4bb6-a33e-cdbf0fff39e7_1761x933.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:771,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:239998,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://fullstackml.dev/i/182654507?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fccc5cef2-2a14-4bb6-a33e-cdbf0fff39e7_1761x933.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!C6o1!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fccc5cef2-2a14-4bb6-a33e-cdbf0fff39e7_1761x933.png 424w, https://substackcdn.com/image/fetch/$s_!C6o1!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fccc5cef2-2a14-4bb6-a33e-cdbf0fff39e7_1761x933.png 848w, https://substackcdn.com/image/fetch/$s_!C6o1!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fccc5cef2-2a14-4bb6-a33e-cdbf0fff39e7_1761x933.png 
1272w, https://substackcdn.com/image/fetch/$s_!C6o1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fccc5cef2-2a14-4bb6-a33e-cdbf0fff39e7_1761x933.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">In Burr, chatbot state tracking is built-in</figcaption></figure></div><h3>Synthetic Data</h3><p>When you&#8217;re building your application, especially when you haven&#8217;t released it yet, there will be a shortage of real-world users from whom you can collect traces. 
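</p><p>The fix is to manufacture them. Before getting into the strategies, here is the core mechanic in miniature: cross a few scenario dimensions to get unique tuples, each of which an LLM can later flesh out into a realistic user question. The dimension names and values below are purely illustrative, not the ones from the actual notebook:</p>

```python
from itertools import product

# Illustrative scenario dimensions for a bike-shop chatbot.
# These names and values are hypothetical, not the real notebook's.
personas = ["commuter", "mountain biker", "e-bike owner"]
intents = ["ask about services", "ask about opening hours", "book a tune-up"]
tones = ["polite", "impatient"]

# Every unique (persona, intent, tone) tuple becomes one synthetic scenario
# that an LLM is then asked to turn into a natural-sounding user question.
scenarios = [
    {"persona": p, "intent": i, "tone": t}
    for p, i, t in product(personas, intents, tones)
]

print(len(scenarios))  # 3 * 3 * 2 = 18 unique scenario tuples
```

<p>Each tuple then seeds exactly one synthetic trace, which keeps the generated questions diverse instead of clustering around whatever the LLM finds "typical".</p><p>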
One technique is to generate synthetic traces that we can use for error analysis. There are a couple of strategies for this. </p><h4>Single Turn Conversations</h4><p>Building single-turn conversations is the simpler case. First, we generate unique dimension tuples (see the&nbsp;<a href="https://github.com/jaeyow/multi_turn_chatbot_eval/blob/feat/error-analysis/error_analysis/error_analysis.ipynb">notebook where we generate</a>&nbsp;them) and use an LLM to expand each tuple into a scenario. Once the tuple scenarios have been created, we use an LLM again to pose these synthetic questions and let our chatbot respond. </p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!09Jq!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3bd853e8-e4d6-414e-89f0-71d0a80a211f_1872x358.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!09Jq!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3bd853e8-e4d6-414e-89f0-71d0a80a211f_1872x358.png 424w, https://substackcdn.com/image/fetch/$s_!09Jq!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3bd853e8-e4d6-414e-89f0-71d0a80a211f_1872x358.png 848w, https://substackcdn.com/image/fetch/$s_!09Jq!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3bd853e8-e4d6-414e-89f0-71d0a80a211f_1872x358.png 1272w, https://substackcdn.com/image/fetch/$s_!09Jq!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3bd853e8-e4d6-414e-89f0-71d0a80a211f_1872x358.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!09Jq!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3bd853e8-e4d6-414e-89f0-71d0a80a211f_1872x358.png" width="1456" height="278" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3bd853e8-e4d6-414e-89f0-71d0a80a211f_1872x358.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:278,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:96886,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://fullstackml.dev/i/182654507?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3bd853e8-e4d6-414e-89f0-71d0a80a211f_1872x358.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!09Jq!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3bd853e8-e4d6-414e-89f0-71d0a80a211f_1872x358.png 424w, https://substackcdn.com/image/fetch/$s_!09Jq!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3bd853e8-e4d6-414e-89f0-71d0a80a211f_1872x358.png 848w, https://substackcdn.com/image/fetch/$s_!09Jq!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3bd853e8-e4d6-414e-89f0-71d0a80a211f_1872x358.png 1272w, https://substackcdn.com/image/fetch/$s_!09Jq!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3bd853e8-e4d6-414e-89f0-71d0a80a211f_1872x358.png 1456w" sizes="100vw" 
loading="lazy"></picture><div></div></div></a><figcaption class="image-caption">Sample of a synthetic scenario for a single-turn conversation</figcaption></figure></div><h4>Multi-turn Conversations</h4><p>Synthetic multi-turn conversations are a bit more complicated to build. However, they are not much harder with the help of some smart people in the community. One library, <a href="https://deepeval.com/docs/conversation-simulator">Deep Eval, has a feature called Conversation Simulator</a> that allows us to simulate full conversations between a synthetic user and our chatbot. </p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!2Ht_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3dbc28da-6f75-45e3-9fea-c3099a3fa4e1_1854x1016.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!2Ht_!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3dbc28da-6f75-45e3-9fea-c3099a3fa4e1_1854x1016.png 424w, https://substackcdn.com/image/fetch/$s_!2Ht_!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3dbc28da-6f75-45e3-9fea-c3099a3fa4e1_1854x1016.png 848w, https://substackcdn.com/image/fetch/$s_!2Ht_!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3dbc28da-6f75-45e3-9fea-c3099a3fa4e1_1854x1016.png 1272w, https://substackcdn.com/image/fetch/$s_!2Ht_!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3dbc28da-6f75-45e3-9fea-c3099a3fa4e1_1854x1016.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!2Ht_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3dbc28da-6f75-45e3-9fea-c3099a3fa4e1_1854x1016.png" width="1456" height="798" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3dbc28da-6f75-45e3-9fea-c3099a3fa4e1_1854x1016.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:798,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:243386,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://fullstackml.dev/i/182654507?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3dbc28da-6f75-45e3-9fea-c3099a3fa4e1_1854x1016.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!2Ht_!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3dbc28da-6f75-45e3-9fea-c3099a3fa4e1_1854x1016.png 424w, https://substackcdn.com/image/fetch/$s_!2Ht_!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3dbc28da-6f75-45e3-9fea-c3099a3fa4e1_1854x1016.png 848w, https://substackcdn.com/image/fetch/$s_!2Ht_!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3dbc28da-6f75-45e3-9fea-c3099a3fa4e1_1854x1016.png 1272w, https://substackcdn.com/image/fetch/$s_!2Ht_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3dbc28da-6f75-45e3-9fea-c3099a3fa4e1_1854x1016.png 1456w" 
sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Simulating full multi-turn conversations using Deep Eval</figcaption></figure></div><h2><strong>Step Two. Open Coding - Naming Failure Patterns</strong></h2><p>Imagine you&#8217;ve run 50 test conversations through <a href="https://github.com/jaeyow/multi_turn_chatbot_eval">JO&#8217;s Bike Shop Chatbot</a>. You have 50 JSON conversation traces. If you just glance at them, you&#8217;ll think, <em>&#8220;Yeah, it seems a bit buggy.&#8221;</em> But &#8220;a bit buggy&#8221; is not an engineering requirement. 
You can&#8217;t fix &#8220;buggy.&#8221;</p><p><strong>Open Coding</strong>&nbsp;is the process of reading those traces and assigning a short descriptive label to them. It has nothing to do with source code at all. This process is not something I have invented. In fact, it is a term adapted from&nbsp;<strong><a href="https://www.google.com/search?q=Grounded+Theory&amp;sca_esv=1760fc6ed6026b55&amp;sxsrf=AE3TifOTSjfEpEPTigq93W-CDBcTkkJERw%3A1766314754739&amp;ei=AtNHabrsLIacseMPtbfm-QU&amp;ved=2ahUKEwigno-Vx86RAxUle_UHHT_mC8oQgK4QegQIARAE&amp;uact=5&amp;oq=what+is+axial+coding+in++grounded+theory&amp;gs_lp=Egxnd3Mtd2l6LXNlcnAiKHdoYXQgaXMgYXhpYWwgY29kaW5nIGluICBncm91bmRlZCB0aGVvcnkyBRAAGIAEMgsQABiABBiGAxiKBTILEAAYgAQYhgMYigUyCBAAGIAEGKIESPk6UJYkWKg5cAJ4AZABAJgBwAGgAdMVqgEEMC4xN7gBA8gBAPgBAZgCC6AC8gvCAgoQABiwAxjWBBhHwgINEAAYgAQYsAMYQxiKBcICChAjGIAEGCcYigXCAgoQABiABBhDGIoFwgIGEAAYBxgewgILEAAYgAQYkQIYigXCAgUQABjvBZgDAIgGAZAGCpIHAzIuOaAHq1WyBwMwLjm4B-gLwgcFMC45LjLIBxmACAA&amp;sclient=gws-wiz-serp&amp;zx=1766315807409&amp;no_sw_cr=1&amp;mstk=AUtExfAudbE39XFLkaQDmx9hDecG7lMW-RTg8XSFfcbx5t1ACfPR-P5r9-i55Uz2Lawz51vNksQeLqYYgFHGlZEfuZsDZ078qPwhffgjmLPEONPeaolD_U9remeGrm2_pRre0QMrs2hcvGLFzCqgQt3gxAKnPwTQTI5eDcfepWa3XGUEli0Vj2phhSX0kszr2wuDqxquxQgsrivrXml9Ww6TQySRriXlLTqTCgVmF9g_M6ZQz_qILvZSlsZC2MHPUu9b_2ckxG0vw7eWyfsMtH6Jr2HaibycdnMjVp5j7X-JpPfCDQ&amp;csui=3">Grounded Theory</a></strong>, a research method in which you build a theory directly from the data collected. Those customer or user survey feedback forms that many providers ask you to fill in? 
Yes, understanding and finding insights from them are most definitely achieved by the same methods we use here.&nbsp;</p><h3>How I did it with the Bike Shop Traces</h3><p>I sat down with a croissant and a cup of coffee and opened both <a href="https://github.com/jaeyow/multi_turn_chatbot_eval/blob/main/error_analysis/single_turn_analysis.csv">single</a> and <a href="https://github.com/jaeyow/multi_turn_chatbot_eval/blob/main/error_analysis/multi_turn_analysis.csv">multi-turn traces</a>. Every time the bot said something that didn&#8217;t align with the shop&#8217;s &#8220;Source of Truth&#8221; (my fictional handbook), I assigned it a label.</p><p>Here is what that looked like in practice:</p><ul><li><p><strong>The Trace:</strong> User asks about e-bikes. Bot says, <em>&#8220;We do motor diagnostics!&#8221;</em></p></li><li><p><strong>My Label:</strong> User is asking a general question about e-bikes, but the bot is replying with a very specific answer about e-bike motor diagnostics.<br></p></li><li><p><strong>The Trace:</strong> User asks about Saturday hours. Bot says, <em>&#8220;We close at 5 PM.&#8221;</em></p></li><li><p><strong>My Label:</strong>&nbsp;The AI bot answers with a &#8220;common sense&#8221; closing time of 5PM; however, Saturday (or weekend) hours are different.<br></p></li><li><p><strong>The Trace:</strong> User says, <em>&#8220;I&#8217;ll come in at 9:30 AM.&#8221;</em> Bot says, <em>&#8220;See you then!&#8221;</em></p></li><li><p><strong>My Label:</strong>&nbsp;The AI Bot is too agreeable and always says yes by default, without taking into account the actual opening time on that day.</p></li></ul><h3>The &#8220;Aha!&#8221; Moment</h3><p>The magic of Open Coding is that you don&#8217;t start with a list of errors.&nbsp;<strong>You let the emerging error labels tell you what they are.</strong>&nbsp;In my first 10 traces, I noticed the bot kept telling people they didn&#8217;t need an appointment. I labelled them with similar descriptions. 
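</p><p>If you prefer to keep the tally in code rather than a spreadsheet, a plain counter over your free-text labels is all it takes. A minimal sketch, where the labels paraphrase the bike-shop examples above and the trace order is made up:</p>

```python
from collections import Counter

# Open-coding output: one plain-English label per problematic trace.
# These labels paraphrase the bike-shop examples above; order is made up.
labels = [
    "told user no appointment needed",
    "wrong saturday closing time",
    "told user no appointment needed",
    "over-specific answer about e-bike motor diagnostics",
    "told user no appointment needed",
    "agreed to arrival time before opening",
]

# Tallying the labels surfaces the dominant failure pattern immediately.
counts = Counter(labels)
for label, n in counts.most_common():
    print(f"{n}x {label}")
```

<p>The point is not the code - it is that identical or near-identical labels start stacking up, and the stack is your signal.</p><p>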
By the time I reached trace #15, I realised I had used that same label 6 times!</p><p>Suddenly, I wasn&#8217;t just &#8220;vibing&#8221; that the bot was wrong. I had&nbsp;<strong>data</strong>. I knew 40% of my failures stemmed from a single failure mode around our booking policy - and that fixing that one failure mode would eliminate 40% of the observed errors. </p><h3>Why this is a &#8220;Mindset Shift&#8221;</h3><p>For most of us, our instinct is to jump straight into the code and change the prompt. We see one error and we think, <em>&#8220;Oh, I&#8217;ll just tell the LLM &#8216;Don&#8217;t say we do motor diagnostics&#8217; in the system prompt.&#8221;</em></p><p><strong>Stop.</strong>&nbsp;That is &#8220;Whack-a-Mole&#8221; engineering. Open Coding forces you to slow down. It turns a messy pile of conversations into a structured list of&nbsp;<strong>Failure Modes</strong>, which we will group into categories in <strong>Axial Coding</strong>, coming up next.&nbsp;</p><h3>Do Open Coding yourself!</h3><p>If you&#8217;re doing this for the first time, don&#8217;t overthink the descriptive labels. Use plain English. I know it might be tempting to hand this activity off to a 3rd party or rely on a good LLM, which can certainly perform it in the blink of an eye.</p><p>However, this is the most valuable work you can do for your project: collecting traces, open coding (adding descriptive labels), and finally axial coding, which we will discuss in the next section. Knowing how your application fails your users is the best way to improve your application!</p><h2><strong>Step Three: Axial Coding - Tidying the Notes into Categories</strong></h2><p>If <strong>Open Coding</strong> is about creating a pile of sticky notes, <strong>Axial Coding</strong> is simply the act of grouping those notes into categories.</p><p>Think of it like sorting your laundry. 
You have a messy pile of clothes (your descriptive labels), and you start putting them into baskets: &#8220;Whites,&#8221; &#8220;Darks,&#8221; and &#8220;Delicates.&#8221; In our case, the baskets are our <strong>Master Categories</strong>.</p><p>When I looked at my &#8220;sticky notes&#8221; for the Bike Shop, the categorisation became obvious:</p><ul><li><p><strong>The &#8220;Bad Data&#8221; Category:</strong> I put every note about incorrect closing times, prices, or service lists here.</p></li><li><p><strong>The &#8220;Policy Confusion&#8221; Category:</strong> I grouped all the notes where the bot gave incorrect advice about walk-ins vs. appointments, for example.</p></li><li><p><strong>The &#8220;Over-Eager AI&#8221; Category:</strong> I put the notes where the bot promised motor diagnostics or firmware updates, even though that was wrong.</p></li></ul><h3>Why is this good</h3><p>This three-step process -&nbsp;<strong>Collect</strong>,&nbsp;<strong>Describe, then Categorise</strong>&nbsp;- is the secret to moving beyond &#8220;vibes.&#8221;</p><p>Instead of telling your team,&nbsp;<em>&#8220;The bot feels a bit hallucinatory,&#8221;</em>&nbsp;you can show them a chart. You can say:&nbsp;<em>&#8220;We have 15 instances of &#8216;Bad Data&#8217; and 3 instances of &#8216;Over-Eager&#8217; behaviour. Our priority should be updating our Saturday hours in the knowledge base, as that&#8217;s where most of our &#8216;Bad Data&#8217; notes are coming from.&#8221;</em></p><p>You&#8217;ve turned a messy conversation into a concrete and actionable <strong>to-do list</strong> of tickets to improve our application.</p><div><hr></div><h3>Closing the Analysis Activity</h3><p>You&#8217;ve now mastered the&nbsp;<strong>Analysis</strong>&nbsp;phase of the AMI (Analyse, Measure, Improve) cycle. 
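</p><p>In code, this grouping step collapses into one small structure: a mapping from open-coding labels to master categories, counted per category. The labels and categories below are the illustrative ones from this article:</p>

```python
from collections import Counter

# Axial coding: assign each open-coding label to a master category.
# Labels and categories are the illustrative ones used in this article.
CATEGORY_OF = {
    "wrong saturday closing time": "Bad Data",
    "told user no appointment needed": "Policy Confusion",
    "promised motor diagnostics we do not offer": "Over-Eager AI",
    "agreed to arrival time before opening": "Policy Confusion",
}

# Open-coding labels collected from traces (made-up sample).
labels = [
    "told user no appointment needed",
    "wrong saturday closing time",
    "told user no appointment needed",
    "promised motor diagnostics we do not offer",
]

category_counts = Counter(CATEGORY_OF[label] for label in labels)
print(category_counts.most_common())
```

<p>The per-category counts are exactly the &#8220;show them a chart&#8221; ammunition: each category with a non-trivial count becomes a ticket.</p><p>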
You know how to capture the &#8220;Black Box&#8221; recordings (<strong>Traces</strong>), describe the failures (<strong>Open Coding</strong>), and group them into actionable categories (<strong>Axial Coding</strong>).</p><p>But to be honest, doing this manually for every conversation isn&#8217;t scalable.</p><p>In&nbsp;<strong>Part 3</strong>, we&#8217;ll tackle the&nbsp;<strong>Measure</strong>&nbsp;phase. I&#8217;ll show you how to take the categories we just created and teach&nbsp;<em>another</em>&nbsp;LLM to recognise them. We&#8217;re going to build&nbsp;<strong>Automated Evaluators</strong>&nbsp;so your chatbot can be audited 24/7 without you needing a croissant or even a single cup of coffee!</p><p></p>]]></content:encoded></item><item><title><![CDATA[#11 - A Developer’s Guide to Failure Analysis and Evaluation for Single-turn and Multi-turn Chatbots - Part 1]]></title><description><![CDATA[In this article, we&#8217;ll look at how failures emerge in LLM chatbot conversations, why they&#8217;re hard to detect, how to systematically analyse them using real conversation traces - not just gut feel]]></description><link>https://www.fullstackml.dev/p/11-a-developers-guide-to-failure</link><guid isPermaLink="false">https://www.fullstackml.dev/p/11-a-developers-guide-to-failure</guid><dc:creator><![CDATA[JO Reyes]]></dc:creator><pubDate>Sun, 21 Dec 2025 22:00:33 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!kQsI!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F295ece92-4ac2-47a5-9990-66cd2c39fc1a_1024x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Many developers, including myself in the past, often use a simple metric to judge the effectiveness of an LLM-powered chatbot:&nbsp;</p><p><em>&#8220;Does it answer the question?&#8221;</em>&nbsp;</p><p>However, good chatbots rarely involve a one-question, one-answer exchange. 
</p><p>Failures often don&#8217;t show up in a single turn (though they can still happen); instead, these errors tend to appear gradually, for example, in the 3rd, 4th, or 5th turn or later.&nbsp;</p><p>For this article, I have prepared a multi-turn chatbot called JO&#8217;s Bike Shop Chatbot. Instead of the usual approach of putting everything into the prompt, I&#8217;ve opted for a graph-based chatbot using state machines, a common strategy for building chatbots that adhere more reliably to your plans.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!kQsI!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F295ece92-4ac2-47a5-9990-66cd2c39fc1a_1024x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!kQsI!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F295ece92-4ac2-47a5-9990-66cd2c39fc1a_1024x1024.png 424w, https://substackcdn.com/image/fetch/$s_!kQsI!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F295ece92-4ac2-47a5-9990-66cd2c39fc1a_1024x1024.png 848w, https://substackcdn.com/image/fetch/$s_!kQsI!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F295ece92-4ac2-47a5-9990-66cd2c39fc1a_1024x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!kQsI!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F295ece92-4ac2-47a5-9990-66cd2c39fc1a_1024x1024.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!kQsI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F295ece92-4ac2-47a5-9990-66cd2c39fc1a_1024x1024.png" width="1024" height="1024" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/295ece92-4ac2-47a5-9990-66cd2c39fc1a_1024x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1024,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1407808,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://fullstackml.dev/i/181128890?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F295ece92-4ac2-47a5-9990-66cd2c39fc1a_1024x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!kQsI!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F295ece92-4ac2-47a5-9990-66cd2c39fc1a_1024x1024.png 424w, https://substackcdn.com/image/fetch/$s_!kQsI!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F295ece92-4ac2-47a5-9990-66cd2c39fc1a_1024x1024.png 848w, https://substackcdn.com/image/fetch/$s_!kQsI!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F295ece92-4ac2-47a5-9990-66cd2c39fc1a_1024x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!kQsI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F295ece92-4ac2-47a5-9990-66cd2c39fc1a_1024x1024.png 1456w" 
sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p></p><p>Using graphs or state machines to build your LLM applications allows you to create applications that you can fully customise and control, rather than relying on the &#8220;prompt and pray&#8221; approach, where you put everything in the prompt and hope the LLM will follow all your instructions.</p><p>Let&#8217;s go through a few examples of single-turn exchanges, as well as how multi-turn conversations can lead to subtle errors as the discussion progresses.</p><h2>Single-turn Examples: Subtle Failures</h2><p>Typical single-turn conversations can contain subtle failures, 
and none of them seems wrong at first glance. The AI often gets these answers nearly right, and in multi-turn conversations such near-misses can compound. </p><h3>Example 1: Over-generalisation of Services</h3><p><strong>User:</strong><br>Do you service e-bikes?</p><p><strong>LLM (subtly wrong):</strong><br>Yes, JO&#8217;s Bike Shop services most e-bikes, including motor diagnostics, battery checks, and firmware updates.</p><p><strong>Why this is a subtle failure</strong></p><ul><li><p>Sounds helpful and confident</p></li><li><p>Many bike shops <em>do</em> service e-bikes</p></li><li><p>But your fictional shop only:</p><ul><li><p>does <strong>basic mechanical work</strong> on e-bikes</p></li><li><p>does <strong>not</strong> perform motor diagnostics or firmware updates</p></li></ul></li></ul><h3>Example 2: Invented Policy (Hallucinated Detail)</h3><p><strong>User:</strong><br>Do you charge for bike safety inspections?</p><p><strong>LLM (almost right):</strong><br>Safety inspections at Jo&#8217;s Bike Shop are free if you&#8217;re getting another service done; otherwise, there&#8217;s a small inspection fee.</p><p><strong>Why is this subtle</strong></p><ul><li><p>This is a <em>very common</em> real-world policy</p></li><li><p>The answer hedges with &#8220;small fee&#8221;</p></li><li><p>But the shop&#8217;s actual policy is:</p><ul><li><p><strong>free safety inspections, always</strong></p></li></ul></li></ul><h3>Example 3: Shop Hours Assumption Error</h3><p><strong>User:</strong><br>What time do you close on Saturdays?</p><p><strong>LLM (plausible but wrong):</strong><br>Jo&#8217;s Bike Shop is usually open until 5pm on Saturdays.</p><p><strong>Why is this subtle</strong></p><ul><li><p>Uses &#8220;usually&#8221; to hedge</p></li><li><p>5 pm is a standard retail closing time</p></li><li><p>But the actual hours are:</p><ul><li><p><strong>Saturday: 9am&#8211;3pm</strong></p></li></ul></li></ul><h2>Multi-turn (5 turns) Example: Subtle 
Failures</h2><p><strong>Scenario</strong></p><p>A customer wants a <strong>tune-up</strong> and <strong>brake adjustment</strong>, but isn&#8217;t sure if they need to book ahead.</p><p>The chatbot starts off helpful&#8230; and then slowly drifts into misinformation about store policy and service availability.</p><h3>Multi-Turn Conversation + Hidden Failure Commentary</h3><h4><strong>Turn 1</strong></h4><p><strong>User:</strong><br><em>Hi, do you guys do regular bike tune-ups?</em></p><p><strong>LLM (correct):</strong><br>Absolutely! Jo&#8217;s Bike Shop offers a full standard tune-up, including gear indexing, brake alignment, wheel truing, safety checks, and general lubrication. I can help you figure out which service fits your bike.</p><p><strong>Notes:</strong><br>&#9989; Accurate. This matches the shop&#8217;s fictional service list.</p><div><hr></div><h4><strong>Turn 2</strong></h4><p><strong>User:</strong><br><em>Nice. Mine also has squeaky brakes. Do I need a separate service for that?</em></p><p><strong>LLM (still okay):</strong><br>Squeaky brakes are usually handled as part of a standard tune-up, though sometimes we may recommend a brake pad replacement if they&#8217;re worn. It depends on the inspection, but you typically don&#8217;t need a separate booking for squeaks alone.</p><p><strong>Notes:</strong><br>&#9989; Still good. Everything here is plausible and consistent.</p><div><hr></div><h4><strong>Turn 3</strong></h4><p><strong>User:</strong><br><em>Do I need to book ahead or can I just walk in?</em></p><p><strong>LLM (first subtle issue):</strong><br>For most tune-ups, walk-ins are totally fine. 
If the store is busy, you might need to leave the bike for a few hours, but generally you don&#8217;t need an appointment.</p><p><strong>Hidden problem:</strong><br>&#9940; <strong>Your actual shop policy (say, from your shop documents or website) says tune-ups </strong><em><strong>must be booked</strong></em><strong> because mechanics only take walk-ins for minor fixes.</strong><br>But the model has confidently stated the opposite.</p><p>This is something beginners won&#8217;t catch because it <em>sounds</em> correct.</p><div><hr></div><h4>Turn 4</h4><p><strong>User:</strong><br><em>Cool. I&#8217;d like to drop my bike in tomorrow morning then. Is that usually a quiet time?</em></p><p><strong>LLM (second subtle drift):</strong><br>Mornings are usually quieter, yes. Most people come in after work. If you bring the bike in around 9 or 10am, someone can take a look right away.</p><p><strong>Hidden problem:</strong><br>&#9940; Your fictional shop opens at <strong>10am</strong>.<br>&#9940; And mornings are actually the <strong>busiest</strong> intake period for our fictitious bike shop.</p><p>The model sounds confident (it always does), but it has now invented:</p><ul><li><p>the wrong opening time</p></li><li><p>the wrong traffic pattern</p></li><li><p>an incorrect service protocol</p></li></ul><p>All plausible. But all wrong.</p><div><hr></div><h4>Turn 5</h4><p><strong>User:</strong><br>Sweet. So I&#8217;ll be there tomorrow at 9:30am for a walk-in tune-up and brake inspection.</p><p><strong>LLM (final failure):</strong><br>Perfect! Just come by anytime between 9am&#8211;12pm and we&#8217;ll get you sorted. 
No need to book anything - we handle walk-ins right away.</p><p><strong>Culmination of subtle failures:</strong></p><ul><li><p>The model has <strong>fully committed</strong> to false store hours</p></li><li><p>It <strong>continues to reinforce</strong> the incorrect walk-in policy</p></li><li><p>It <strong>reassures the user</strong> that they&#8217;ll be seen immediately</p></li></ul><p>Nothing is obviously broken. Every answer sounds reasonable. But the chatbot has led the user to a&nbsp;<strong>poor customer experience</strong>, entirely due to minor inaccuracies that compounded across turns.</p><h2>Failure Analysis Primer</h2><p>In this section, we&#8217;ll introduce the concept of failure analysis to identify what went wrong in the LLM application. Most developers stop at vibe checks and gut feel, but intuition does not scale: as you accumulate more conversations, more model versions, and more team members, intuition becomes subjective, with each person developing their own sense of what &#8220;working&#8221; means. LLM systems require explicit failure analysis, not gut feel alone. 
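To make this concrete, here is a minimal Python sketch of what "explicit" failure analysis buys you once traces are labelled (the trace IDs and failure labels below are invented for illustration, not taken from a real system): failure modes become counts you can rank, instead of impressions.

```python
from collections import Counter

# Hypothetical labelled traces: each trace carries free-form failure
# labels that a reviewer attached while reading the conversation.
labelled_traces = [
    {"id": "t1", "labels": ["wrong_walk_in_policy"]},
    {"id": "t2", "labels": ["wrong_opening_hours", "wrong_walk_in_policy"]},
    {"id": "t3", "labels": []},  # no failures observed in this trace
]

# Counting label frequency turns gut feel into a ranked list of problems.
counts = Counter(label for t in labelled_traces for label in t["labels"])
print(counts.most_common())
# [('wrong_walk_in_policy', 2), ('wrong_opening_hours', 1)]
```

Even this trivial tally answers a question intuition cannot: which failure mode should we fix first?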
</p><p>A course I recently attended was&nbsp;<a href="https://maven.com/parlance-labs/evals">'AI Evals For Engineers &amp; PMs'</a>, which first introduced me to the AMI Lifecycle for building Evaluation Systems.&nbsp;<strong>AMI stands</strong>&nbsp;for&nbsp;<strong>A</strong>nalyse,&nbsp;<strong>M</strong>easure and&nbsp;<strong>I</strong>mprove.&nbsp;</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!T4rG!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0259bf3e-3cf8-4480-8c3d-a910957e655f_604x457.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!T4rG!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0259bf3e-3cf8-4480-8c3d-a910957e655f_604x457.png 424w, https://substackcdn.com/image/fetch/$s_!T4rG!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0259bf3e-3cf8-4480-8c3d-a910957e655f_604x457.png 848w, https://substackcdn.com/image/fetch/$s_!T4rG!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0259bf3e-3cf8-4480-8c3d-a910957e655f_604x457.png 1272w, https://substackcdn.com/image/fetch/$s_!T4rG!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0259bf3e-3cf8-4480-8c3d-a910957e655f_604x457.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!T4rG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0259bf3e-3cf8-4480-8c3d-a910957e655f_604x457.png" width="604" height="457" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0259bf3e-3cf8-4480-8c3d-a910957e655f_604x457.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:457,&quot;width&quot;:604,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:12398,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://fullstackml.dev/i/181128890?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0259bf3e-3cf8-4480-8c3d-a910957e655f_604x457.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!T4rG!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0259bf3e-3cf8-4480-8c3d-a910957e655f_604x457.png 424w, https://substackcdn.com/image/fetch/$s_!T4rG!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0259bf3e-3cf8-4480-8c3d-a910957e655f_604x457.png 848w, https://substackcdn.com/image/fetch/$s_!T4rG!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0259bf3e-3cf8-4480-8c3d-a910957e655f_604x457.png 1272w, https://substackcdn.com/image/fetch/$s_!T4rG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0259bf3e-3cf8-4480-8c3d-a910957e655f_604x457.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">The Analyse - Measure - Improve LLM Evaluation Lifecycle</figcaption></figure></div><p><strong>Analysis</strong>&nbsp;involves collecting representative samples, such as traces, to enable failure mode categorisation. 
Once the failure modes are categorised, <strong>Measure</strong>&nbsp;follows: we generate quantifiable metrics that estimate how often each error occurs and guide the developer in prioritising fixes.</p><p>Finally, we move to the&nbsp;<strong>Improve</strong>&nbsp;stage, where we begin enhancing the system by updating prompts, improving pipeline components, refining retrieval, and/or fine-tuning models.</p><p><strong>Analysis</strong> &#8594; gather traces, then perform <strong>Open Coding</strong> and <strong>Axial Coding</strong>.</p><p><strong>Measure</strong> &#8594; build evaluators for our LLM application.</p><p><strong>Improve</strong> &#8594; update and improve our LLM application.</p><p>The AMI lifecycle essentially creates a feedback loop that allows us to <strong>constantly enhance</strong> our LLM application.</p><h3><strong>Step One. Collecting Conversation Traces</strong></h3><p>A complete trace consists of a full single- or multi-turn conversation that includes inputs, outputs, tool calls, and metadata (such as prompts). Importantly, it spans everything from the very first user input through to the final response.</p><p>A trace can originate from real user interactions or be synthetically generated for testing. Synthetic traces are especially useful when real user data isn't yet available, enabling testing before the application has any real users. They help cover edge cases, replicate failures, and eliminate user variability from analysis. These traces can be linked to specific examples, whether single-turn or multi-turn conversations.</p><h3><strong>Step Two. 
Open Coding - Naming Failure Patterns</strong></h3><p>The process of systematically reading and labelling these collected traces is called&nbsp;<em><strong>open coding</strong></em>, a term adapted from <strong><a href="https://www.google.com/search?q=Grounded+Theory&amp;sca_esv=1760fc6ed6026b55&amp;sxsrf=AE3TifOTSjfEpEPTigq93W-CDBcTkkJERw%3A1766314754739&amp;ei=AtNHabrsLIacseMPtbfm-QU&amp;ved=2ahUKEwigno-Vx86RAxUle_UHHT_mC8oQgK4QegQIARAE&amp;uact=5&amp;oq=what+is+axial+coding+in++grounded+theory&amp;gs_lp=Egxnd3Mtd2l6LXNlcnAiKHdoYXQgaXMgYXhpYWwgY29kaW5nIGluICBncm91bmRlZCB0aGVvcnkyBRAAGIAEMgsQABiABBiGAxiKBTILEAAYgAQYhgMYigUyCBAAGIAEGKIESPk6UJYkWKg5cAJ4AZABAJgBwAGgAdMVqgEEMC4xN7gBA8gBAPgBAZgCC6AC8gvCAgoQABiwAxjWBBhHwgINEAAYgAQYsAMYQxiKBcICChAjGIAEGCcYigXCAgoQABiABBhDGIoFwgIGEAAYBxgewgILEAAYgAQYkQIYigXCAgUQABjvBZgDAIgGAZAGCpIHAzIuOaAHq1WyBwMwLjm4B-gLwgcFMC45LjLIBxmACAA&amp;sclient=gws-wiz-serp&amp;zx=1766315807409&amp;no_sw_cr=1&amp;mstk=AUtExfAudbE39XFLkaQDmx9hDecG7lMW-RTg8XSFfcbx5t1ACfPR-P5r9-i55Uz2Lawz51vNksQeLqYYgFHGlZEfuZsDZ078qPwhffgjmLPEONPeaolD_U9remeGrm2_pRre0QMrs2hcvGLFzCqgQt3gxAKnPwTQTI5eDcfepWa3XGUEli0Vj2phhSX0kszr2wuDqxquxQgsrivrXml9Ww6TQySRriXlLTqTCgVmF9g_M6ZQz_qILvZSlsZC2MHPUu9b_2ckxG0vw7eWyfsMtH6Jr2HaibycdnMjVp5j7X-JpPfCDQ&amp;csui=3">Grounded Theory</a></strong>.</p><p>Grounded theory is a&nbsp;<strong>systematic qualitative research method</strong>&nbsp;in which you build a theory directly from data, rather than starting with a hypothesis. Instead of starting with a fixed list of imagined failure modes, we learn directly from the errors exhibited by the system and allow the failure mode patterns to emerge from the data. 
This enables one to focus on the actual problem at hand rather than the imagined problems we think the system has.</p><p>This represents a&nbsp;<strong>significant mindset shift</strong>&nbsp;for beginners.</p><h3><strong>Step Three: Axial Coding - Connecting the Dots</strong></h3><p>Axial coding is&nbsp;<strong>the second stage in&nbsp;<a href="https://www.google.com/search?q=Grounded+Theory&amp;sca_esv=1760fc6ed6026b55&amp;sxsrf=AE3TifOTSjfEpEPTigq93W-CDBcTkkJERw%3A1766314754739&amp;ei=AtNHabrsLIacseMPtbfm-QU&amp;ved=2ahUKEwigno-Vx86RAxUle_UHHT_mC8oQgK4QegQIARAE&amp;uact=5&amp;oq=what+is+axial+coding+in++grounded+theory&amp;gs_lp=Egxnd3Mtd2l6LXNlcnAiKHdoYXQgaXMgYXhpYWwgY29kaW5nIGluICBncm91bmRlZCB0aGVvcnkyBRAAGIAEMgsQABiABBiGAxiKBTILEAAYgAQYhgMYigUyCBAAGIAEGKIESPk6UJYkWKg5cAJ4AZABAJgBwAGgAdMVqgEEMC4xN7gBA8gBAPgBAZgCC6AC8gvCAgoQABiwAxjWBBhHwgINEAAYgAQYsAMYQxiKBcICChAjGIAEGCcYigXCAgoQABiABBhDGIoFwgIGEAAYBxgewgILEAAYgAQYkQIYigXCAgUQABjvBZgDAIgGAZAGCpIHAzIuOaAHq1WyBwMwLjm4B-gLwgcFMC45LjLIBxmACAA&amp;sclient=gws-wiz-serp&amp;zx=1766315807409&amp;no_sw_cr=1&amp;mstk=AUtExfAudbE39XFLkaQDmx9hDecG7lMW-RTg8XSFfcbx5t1ACfPR-P5r9-i55Uz2Lawz51vNksQeLqYYgFHGlZEfuZsDZ078qPwhffgjmLPEONPeaolD_U9remeGrm2_pRre0QMrs2hcvGLFzCqgQt3gxAKnPwTQTI5eDcfepWa3XGUEli0Vj2phhSX0kszr2wuDqxquxQgsrivrXml9Ww6TQySRriXlLTqTCgVmF9g_M6ZQz_qILvZSlsZC2MHPUu9b_2ckxG0vw7eWyfsMtH6Jr2HaibycdnMjVp5j7X-JpPfCDQ&amp;csui=3">Grounded Theory</a>,&nbsp;</strong>where researchers systematically link initial codes into broader categories, exploring relationships&nbsp;<em>around</em>&nbsp;a central phenomenon (the &#8220;axis&#8221;) to build a theory. 
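As a rough sketch (the open codes and category names below are invented for illustration), axial coding can be as simple as mapping free-form open codes onto broader failure categories and grouping them:

```python
# Hypothetical open codes collected while reading traces, and a manual
# mapping from each open code to a broader axial category.
open_codes = [
    "states_walk_ins_ok_despite_booking_policy",
    "invents_9am_opening_time",
    "claims_mornings_are_quiet",
]

axial_categories = {
    "states_walk_ins_ok_despite_booking_policy": "policy_hallucination",
    "invents_9am_opening_time": "factual_hallucination",
    "claims_mornings_are_quiet": "factual_hallucination",
}

# Group open codes by axial category to see where errors concentrate.
grouped: dict[str, list[str]] = {}
for code in open_codes:
    grouped.setdefault(axial_categories[code], []).append(code)

print(grouped)
```

In practice the mapping itself is done by a human reviewer, not by code; the snippet only illustrates the bookkeeping.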
This is essentially a more formal term for categorisation or grouping, and completing it will finally show the team where errors concentrate in the problem space.&nbsp;</p><h2><strong>Introducing JO&#8217;s Bike Shop Chatbot</strong></h2><p>To illustrate the steps of the <strong>Analysis</strong> stage in the AMI feedback cycle, I have created a GitHub repo for <strong><a href="https://github.com/jaeyow/multi_turn_chatbot_eval">JO&#8217;s Bike Shop Chatbot</a></strong>, a Burr-based chatbot application that we can use for single-turn and multi-turn evaluation exercises.</p><p>This is a good place to pause and regroup. In the next article, we will demonstrate how we performed open coding and axial coding for single- and multi-turn conversations, continuing the <strong>Analysis</strong> phase, where we start to build our AI application evaluations and incrementally improve our chatbot.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!8UeX!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4c7a4fd-12e5-4e8d-a114-bcd68dd0db59_2518x1562.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!8UeX!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4c7a4fd-12e5-4e8d-a114-bcd68dd0db59_2518x1562.png 424w, https://substackcdn.com/image/fetch/$s_!8UeX!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4c7a4fd-12e5-4e8d-a114-bcd68dd0db59_2518x1562.png 848w, 
https://substackcdn.com/image/fetch/$s_!8UeX!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4c7a4fd-12e5-4e8d-a114-bcd68dd0db59_2518x1562.png 1272w, https://substackcdn.com/image/fetch/$s_!8UeX!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4c7a4fd-12e5-4e8d-a114-bcd68dd0db59_2518x1562.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!8UeX!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4c7a4fd-12e5-4e8d-a114-bcd68dd0db59_2518x1562.png" width="1456" height="903" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d4c7a4fd-12e5-4e8d-a114-bcd68dd0db59_2518x1562.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:903,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;frontend.png&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="frontend.png" title="frontend.png" srcset="https://substackcdn.com/image/fetch/$s_!8UeX!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4c7a4fd-12e5-4e8d-a114-bcd68dd0db59_2518x1562.png 424w, https://substackcdn.com/image/fetch/$s_!8UeX!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4c7a4fd-12e5-4e8d-a114-bcd68dd0db59_2518x1562.png 848w, 
https://substackcdn.com/image/fetch/$s_!8UeX!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4c7a4fd-12e5-4e8d-a114-bcd68dd0db59_2518x1562.png 1272w, https://substackcdn.com/image/fetch/$s_!8UeX!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4c7a4fd-12e5-4e8d-a114-bcd68dd0db59_2518x1562.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p></p>]]></content:encoded></item><item><title><![CDATA[#10 - Evals for AI: Why Your Application Needs More Than a 
Vibe]]></title><description><![CDATA[Go beyond vibe checks to ensure your applications work the way you intended]]></description><link>https://www.fullstackml.dev/p/10-evals-for-ai-why-your-application</link><guid isPermaLink="false">https://www.fullstackml.dev/p/10-evals-for-ai-why-your-application</guid><dc:creator><![CDATA[JO Reyes]]></dc:creator><pubDate>Sat, 14 Jun 2025 02:02:09 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!JdY6!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F220ac9ab-d923-447a-b999-dba3b729832d_2048x2048.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div><hr></div><p>This article is part of a blog series about building effective evaluation systems for your AI application. Go beyond "vibe checks" and use a more systematic way to ensure your applications work as you designed them.</p><p><strong>Part 1 (this article)</strong>: Evaluations and your AI Application - This article introduces the concept of evaluations and explains how to approach this strategy for improving your applications.</p><div><hr></div><p>Hi friends,</p><p>This article will kick-start a brand new series, and this time, we will explore the world of generative AI. With my background in software engineering, I want to explore how to build robust and predictable AI systems. 
Let's see what value evaluations bring to our applications.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!JdY6!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F220ac9ab-d923-447a-b999-dba3b729832d_2048x2048.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!JdY6!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F220ac9ab-d923-447a-b999-dba3b729832d_2048x2048.png 424w, https://substackcdn.com/image/fetch/$s_!JdY6!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F220ac9ab-d923-447a-b999-dba3b729832d_2048x2048.png 848w, https://substackcdn.com/image/fetch/$s_!JdY6!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F220ac9ab-d923-447a-b999-dba3b729832d_2048x2048.png 1272w, https://substackcdn.com/image/fetch/$s_!JdY6!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F220ac9ab-d923-447a-b999-dba3b729832d_2048x2048.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!JdY6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F220ac9ab-d923-447a-b999-dba3b729832d_2048x2048.png" width="1456" height="1456" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/220ac9ab-d923-447a-b999-dba3b729832d_2048x2048.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1456,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:4511918,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://fullstackml.dev/i/165908260?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F220ac9ab-d923-447a-b999-dba3b729832d_2048x2048.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!JdY6!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F220ac9ab-d923-447a-b999-dba3b729832d_2048x2048.png 424w, https://substackcdn.com/image/fetch/$s_!JdY6!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F220ac9ab-d923-447a-b999-dba3b729832d_2048x2048.png 848w, https://substackcdn.com/image/fetch/$s_!JdY6!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F220ac9ab-d923-447a-b999-dba3b729832d_2048x2048.png 1272w, https://substackcdn.com/image/fetch/$s_!JdY6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F220ac9ab-d923-447a-b999-dba3b729832d_2048x2048.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p></p><h2>Building AI Applications from First Principles</h2><p>For many software developers, writing unit tests is second nature, though your mileage may vary here. Some of us get hung up on code coverage, but for me pragmatism is key: we build tests that add value and make sense.</p><p>Building AI applications is not significantly different from creating any other software application. While there is still value in writing unit and integration tests, developing applications based on these probabilistic models requires much more than that. 
Where we would have a definitive right or wrong outcome for traditional software projects, LLM-based applications necessitate a different approach.</p><h2>In-Context Learning</h2><p>The two main ways of developing LLM applications these days are fine-tuning and in-context learning. Fine-tuning can help a lot in specific instances; however, many practitioners will first reach for&nbsp;<a href="https://www.lakera.ai/blog/what-is-in-context-learning">in-context learning</a>&nbsp;and prompt engineering because they are much cheaper and often just work.</p><p>We all know what prompt engineering involves: writing instructions or context for the LLM to help it carry out its tasks. In-context learning (ICL) is a form of prompt engineering in which we enrich the prompt with related data such as guidelines, background and examples, as we do in Retrieval Augmented Generation (RAG) systems.</p><h2>AI Applications and the Levers You Can Pull</h2><p>Building AI applications is an iterative process, like building any software system. When writing software, we typically build it in stages and layer functionality on top of features as we go. We also generally make sure that the functionality is sound before optimising it.<br><br>With AI applications - systems built on top of LLMs, in-context learning and prompt engineering - improving means strengthening our ICL and prompt engineering techniques and revisiting the choice of language and embedding models we use.</p><h2>Vibe checking can only go so far.</h2><p>Vibe checks are a method commonly used to build such systems. 
We want to answer the question, &#8220;Do the system&#8217;s responses feel right?&#8221; And yes, I use them all the time, and you should, too, as the primary way of ensuring your application works as you have designed it.</p><p>An obvious way to improve your application is to swap your LLM with a better one. These days there&#8217;s a new and &#8220;better&#8221; model every week - it feels like every other day. But even though the LLM developer claims these are better than previous models and many competitors&#8217; models, that means nothing to you and your application until you test it <strong>in your application</strong>.</p><p>Yes, we reach for vibe checks, and you should continue to do so, because people are the main users of your application. But as your requirements grow and your application becomes more complex, you will have to do more and more of these vibe checks.</p><p>For example, you may start with five checks, which you run every time your system changes. What do you do when you have to expand this to ten, twenty, fifty? How about a hundred items in your vibe-check checklist? This is not sustainable at all.</p><h2>Evaluation systems are the answer.</h2><p>This series of articles will discuss what evaluation systems are, why we need them, and, more importantly, how to approach building evaluations for your application.</p><p>Last month, I participated in a program to enhance my AI Engineering skills. With my background in software engineering, I encountered significant challenges when building LLM applications: specifically, the difficulty in creating robust and trustworthy systems. 
This is mainly due to the inherently uncertain and non-deterministic nature of the LLMs themselves.<br><br>The program, <a href="https://maven.com/hugo-stefan/building-llm-apps-ds-and-swe-from-first-principles">Building LLM Applications for Data Scientists and Software Engineers</a>, led by <a href="https://www.linkedin.com/in/hugo-bowne-anderson-045939a5/">Hugo Bowne-Anderson</a> (one of my favorite online educators) and <a href="https://www.linkedin.com/in/skrawczyk/">Stefan Krawczyk</a> (creator of the open-source libraries Hamilton and Burr), proved to be incredibly valuable and focused.</p><p>See you in the following article!</p>]]></content:encoded></item><item><title><![CDATA[#9 - Putting Our Bird Embedding Model to Work: Introducing the Web Frontend]]></title><description><![CDATA[To make our embedding model accessible and useful to a wider audience, we developed a modern front-end interface for similarity inference.]]></description><link>https://www.fullstackml.dev/p/9-putting-our-bird-embedding-model</link><guid isPermaLink="false">https://www.fullstackml.dev/p/9-putting-our-bird-embedding-model</guid><dc:creator><![CDATA[JO Reyes]]></dc:creator><pubDate>Sat, 05 Apr 2025 11:20:41 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff9fdf612-b035-48cb-9838-3f31f845d660_3016x1880.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>This article is <strong>part of a blog series about demystifying vector embedding models</strong> for an image search use case:</p><p><a href="https://fullstackml.dev/p/4-so-how-do-you-build-an-image-embedding">Part 1. So how do you build a vector embedding model?</a> - Introduces vector embedding models and the intuition behind the technologies we can use to build one ourselves.<br><br><a href="https://fullstackml.dev/p/5-lets-build-our-embedding-model">Part 2. 
Let's build our image embedding model</a> - Shows a couple of ways to build embedding models - first by using a <strong>pre-trained model</strong>, and then by <strong>fine-tuning</strong> a pre-trained model. We use <strong>PyTorch</strong> to build our feature extractor.</p><p><a href="https://fullstackml.dev/p/6-using-metaflow-and-mlflow">Part 3. Modelling with Metaflow and MLFlow</a> - Here we use Metaflow to build our model training workflow, where we introduce the concept of checkpointing, and MLFlow for experiment tracking.</p><p><a href="https://fullstackml.dev/p/8-from-training-to-deployment-a-simple">Part 4. From Training to Deployment: A Simple Approach to Serving Embedding Models</a> - Packaging your ML model in a Docker container opens it up to a multitude of model serving options.</p><p>Part 5. Putting Our Bird Embedding Model to Work: Introducing the Web Frontend (this article) - For our embedding model to prove useful to others, we have created a modern frontend to serve similarity inference to our users.</p><div><hr></div><p>Hi friends,</p><p>In this final installment (hopefully!) of our project, we've reached the critical point where theory meets application. Only when our app users can directly interact with our <strong>embedding-model-powered similarity system</strong> will they fully appreciate its value.</p><p>To recap, in our first article we introduced embedding models and then explored how to build them using <strong>PyTorch</strong>. We subsequently utilised tools like Metaflow and MLflow to streamline our MLOps process. Following that, we discussed how to containerise the model for efficient serving.</p><p>However, the true impact of these steps becomes evident only through visual, side-by-side comparisons. The challenge of replicating human-like visual comparison - identifying similarities and differences between bird images - is addressed through machine learning, as demonstrated in our previous articles. 
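As a reminder of the core mechanic behind that similarity search (the vectors below are toy values, not outputs of our actual model), ranking images by embedding similarity boils down to something like cosine similarity:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy 4-dimensional "embeddings"; a real model outputs hundreds of dimensions.
query = [0.9, 0.1, 0.0, 0.2]
gallery = {
    "eagle_1": [0.8, 0.2, 0.1, 0.3],
    "parrot_1": [0.1, 0.9, 0.8, 0.0],
}

# Rank gallery images by similarity to the query image (most similar first).
ranked = sorted(gallery, key=lambda name: cosine_similarity(query, gallery[name]), reverse=True)
print(ranked)  # ['eagle_1', 'parrot_1']
```

In our system this ranking is delegated to the vector database over much higher-dimensional vectors; the snippet only shows the underlying idea.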
</p><p>Therefore, this article will shift its focus from technical details to a practical demonstration of the front-end interface, showcasing the embedding model's capabilities through bird image similarity searches.<br></p><h2>The Bird Similarity Search System: An Overview</h2><p><br>Before proceeding, let me visually summarise what we have accomplished. The lower left box illustrates the process we undertook to <strong>build the bird embedding model</strong> using PyTorch, Metaflow, and MLflow. </p><p>Following the creation of the embedding model, as shown in the lower right box, we utilised FastAPI and Docker to prepare our container for model serving. To enable image semantic search functionality, we selected LanceDB as our embedded vector database, facilitating rapid similarity searches.</p><p>Finally, the upper right box depicts the deployment-ready front-end, developed with <a href="https://react.dev/">ReactJS</a> (<a href="https://vite.dev/">Vite</a>) and <a href="https://www.docker.com/">Docker</a>, designed to showcase the system's capabilities to our users.<br></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!azqR!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F825fae65-5593-4709-a293-aa043a59850f_1733x879.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!azqR!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F825fae65-5593-4709-a293-aa043a59850f_1733x879.png 424w, https://substackcdn.com/image/fetch/$s_!azqR!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F825fae65-5593-4709-a293-aa043a59850f_1733x879.png 848w, 
https://substackcdn.com/image/fetch/$s_!azqR!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F825fae65-5593-4709-a293-aa043a59850f_1733x879.png 1272w, https://substackcdn.com/image/fetch/$s_!azqR!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F825fae65-5593-4709-a293-aa043a59850f_1733x879.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!azqR!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F825fae65-5593-4709-a293-aa043a59850f_1733x879.png" width="1200" height="609.065934065934" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/825fae65-5593-4709-a293-aa043a59850f_1733x879.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:739,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!azqR!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F825fae65-5593-4709-a293-aa043a59850f_1733x879.png 424w, https://substackcdn.com/image/fetch/$s_!azqR!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F825fae65-5593-4709-a293-aa043a59850f_1733x879.png 848w, 
https://substackcdn.com/image/fetch/$s_!azqR!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F825fae65-5593-4709-a293-aa043a59850f_1733x879.png 1272w, https://substackcdn.com/image/fetch/$s_!azqR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F825fae65-5593-4709-a293-aa043a59850f_1733x879.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><h2>Search 1 - The Philippine Eagle</h2><p>I was born in <a href="https://en.wikipedia.org/wiki/Cebu">Cebu</a>, an island in the middle 
of the <a href="https://en.wikipedia.org/wiki/Philippines">Philippines</a>, and even though I now call Australia home, I am still a proud Filipino. So the only bird that should go first is the <a href="https://en.wikipedia.org/wiki/Philippine_eagle">Philippine Eagle</a>. </p><p><strong>Appearance</strong>: This bird has a very distinctive, almost comical appearance. It has a prominent, spiky crest of feathers on its head. Its face is pale with dark markings around the eyes. The beak is relatively short and thick. The body has a mix of brown and white feathers.</p><p>The application displays the top 8 matches; it could have returned more, but we wanted the results to sit nice and even on the screen. After uploading the reference image, it takes a split-second for the results to show. </p><p>Notice that it was able to identify with 87.94% certainty that the reference image is a Philippine Eagle, while the next most similar bird in the database scores a low 38.73%. There will be a little bit of a delay if we deploy this to the cloud; however, this is much better in latency and capability than any generic LLM out there. It just goes to show that <a href="https://en.wikipedia.org/wiki/Generative_artificial_intelligence">Generative AI</a>, as great as it is, is not the answer to every use case out there. 
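</p><p>Those similarity percentages come from comparing embedding vectors. The app's exact metric isn't shown in this article, but ranking stored embeddings by cosine similarity can be sketched as follows; the bird names are real, while the 3-dimensional vectors are made-up stand-ins for real high-dimensional embeddings.</p>

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy "database" of embeddings keyed by species (made-up values).
db = {
    "Philippine Eagle": [0.9, 0.1, 0.3],
    "Great Tit": [0.1, 0.8, 0.2],
    "Java Sparrow": [0.2, 0.7, 0.4],
}
query = [0.88, 0.15, 0.28]  # embedding of the uploaded reference image

# Rank every stored bird by similarity to the query, best match first.
ranked = sorted(db, key=lambda name: cosine(query, db[name]), reverse=True)
top_match = ranked[0]
```

<p>With real embeddings, a correct match separates cleanly from the rest of the database, which is exactly what the 87.94% versus 38.73% gap demonstrates.</p><p>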
</p><p></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!AC0z!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff9fdf612-b035-48cb-9838-3f31f845d660_3016x1880.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!AC0z!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff9fdf612-b035-48cb-9838-3f31f845d660_3016x1880.png 424w, https://substackcdn.com/image/fetch/$s_!AC0z!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff9fdf612-b035-48cb-9838-3f31f845d660_3016x1880.png 848w, https://substackcdn.com/image/fetch/$s_!AC0z!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff9fdf612-b035-48cb-9838-3f31f845d660_3016x1880.png 1272w, https://substackcdn.com/image/fetch/$s_!AC0z!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff9fdf612-b035-48cb-9838-3f31f845d660_3016x1880.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!AC0z!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff9fdf612-b035-48cb-9838-3f31f845d660_3016x1880.png" width="1200" height="748.3516483516484" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f9fdf612-b035-48cb-9838-3f31f845d660_3016x1880.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:908,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:2833769,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://fullstackml.dev/i/159599559?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff9fdf612-b035-48cb-9838-3f31f845d660_3016x1880.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!AC0z!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff9fdf612-b035-48cb-9838-3f31f845d660_3016x1880.png 424w, https://substackcdn.com/image/fetch/$s_!AC0z!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff9fdf612-b035-48cb-9838-3f31f845d660_3016x1880.png 848w, https://substackcdn.com/image/fetch/$s_!AC0z!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff9fdf612-b035-48cb-9838-3f31f845d660_3016x1880.png 1272w, https://substackcdn.com/image/fetch/$s_!AC0z!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff9fdf612-b035-48cb-9838-3f31f845d660_3016x1880.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>For example, I showed this image to both <a href="https://gemini.google.com/app?hl=en-AU">Gemini</a> and <a href="https://openai.com/index/chatgpt/">ChatGPT</a>, with mixed results. Gemini identified the bird and suggested other similar birds, and it also took less than a second, but it could only give me text descriptions. I know it&#8217;s unfair to Gemini, as it&#8217;s a language model after all. But it&#8217;s multi-modal, so it should have been able to show me pictures too.</p><p>ChatGPT did not do too well either: though it did try to generate images of similar birds, it got stuck and produced an error after trying for a minute. </p><h2>Search 2 - Parus Major</h2><p><strong>Appearance:</strong> This is a <strong>Great Tit</strong> (<em>Parus major</em>). 
It has a black cap, white cheeks, a yellow breast with a black stripe down the center, a greyish back, and a dark beak. It's a small passerine bird.</p><p>Despite our vector database containing only 50 bird images, it effectively demonstrates the similarity search process. By storing the bird images' vector embeddings in <a href="https://lancedb.com/">LanceDB</a>, we can perform a similarity search by inputting a bird image.</p><p>Upon upload via FastAPI, the image is converted into a vector embedding, which is then submitted as a query to LanceDB. Due to the absence of a specified index, an exact search is performed exhaustively (brute-force) within a fraction of a second. </p><p>For databases containing millions of images, indexing is necessary to enable efficient nearest-neighbour searches, which scale to larger datasets but may result in a lower recall.</p><p></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!xCx7!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffaa4812e-c0da-45bb-8d2d-0f49cf0c5320_3008x1874.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!xCx7!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffaa4812e-c0da-45bb-8d2d-0f49cf0c5320_3008x1874.png 424w, https://substackcdn.com/image/fetch/$s_!xCx7!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffaa4812e-c0da-45bb-8d2d-0f49cf0c5320_3008x1874.png 848w, https://substackcdn.com/image/fetch/$s_!xCx7!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffaa4812e-c0da-45bb-8d2d-0f49cf0c5320_3008x1874.png 1272w, 
https://substackcdn.com/image/fetch/$s_!xCx7!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffaa4812e-c0da-45bb-8d2d-0f49cf0c5320_3008x1874.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!xCx7!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffaa4812e-c0da-45bb-8d2d-0f49cf0c5320_3008x1874.png" width="1200" height="747.5274725274726" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/faa4812e-c0da-45bb-8d2d-0f49cf0c5320_3008x1874.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:907,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:2764780,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://fullstackml.dev/i/159599559?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffaa4812e-c0da-45bb-8d2d-0f49cf0c5320_3008x1874.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!xCx7!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffaa4812e-c0da-45bb-8d2d-0f49cf0c5320_3008x1874.png 424w, https://substackcdn.com/image/fetch/$s_!xCx7!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffaa4812e-c0da-45bb-8d2d-0f49cf0c5320_3008x1874.png 848w, 
https://substackcdn.com/image/fetch/$s_!xCx7!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffaa4812e-c0da-45bb-8d2d-0f49cf0c5320_3008x1874.png 1272w, https://substackcdn.com/image/fetch/$s_!xCx7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffaa4812e-c0da-45bb-8d2d-0f49cf0c5320_3008x1874.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><h2>Search 3 - Java Sparrow</h2><p><strong>Appearance:</strong> This is a <strong>Java Sparrow</strong> (<em>Lonchura oryzivora</em>). 
It has a distinctive pink beak, a grey head, a pinkish cheek patch, a grey body, and a black tail. It's a small finch-like bird.</p><p>Observing both the Java Sparrow and the previously described Great Tit (Parus major), one might notice superficial similarities, such as the white cheek patches and greyish body coloration. However, that is where the similarities end.</p><p>It is noteworthy that the application accurately reflects this distinction. While it identifies both birds as among the most similar, the similarity scores clearly indicate a significant difference. The Java Sparrow is identified with a 96.98% similarity, whereas the Great Tit registers only 47.51%.</p><p>This effectively demonstrates that, despite some shared visual features, the application correctly recognises the Great Tit as distinctly different from the reference Java Sparrow.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!mTIJ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1abdd8b8-a375-4ffa-b650-1bcbd49ad3b2_3012x1878.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!mTIJ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1abdd8b8-a375-4ffa-b650-1bcbd49ad3b2_3012x1878.png 424w, https://substackcdn.com/image/fetch/$s_!mTIJ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1abdd8b8-a375-4ffa-b650-1bcbd49ad3b2_3012x1878.png 848w, https://substackcdn.com/image/fetch/$s_!mTIJ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1abdd8b8-a375-4ffa-b650-1bcbd49ad3b2_3012x1878.png 1272w, 
https://substackcdn.com/image/fetch/$s_!mTIJ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1abdd8b8-a375-4ffa-b650-1bcbd49ad3b2_3012x1878.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!mTIJ!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1abdd8b8-a375-4ffa-b650-1bcbd49ad3b2_3012x1878.png" width="1200" height="748.3516483516484" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1abdd8b8-a375-4ffa-b650-1bcbd49ad3b2_3012x1878.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:908,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:2407511,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://fullstackml.dev/i/159599559?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1abdd8b8-a375-4ffa-b650-1bcbd49ad3b2_3012x1878.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!mTIJ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1abdd8b8-a375-4ffa-b650-1bcbd49ad3b2_3012x1878.png 424w, https://substackcdn.com/image/fetch/$s_!mTIJ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1abdd8b8-a375-4ffa-b650-1bcbd49ad3b2_3012x1878.png 848w, 
https://substackcdn.com/image/fetch/$s_!mTIJ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1abdd8b8-a375-4ffa-b650-1bcbd49ad3b2_3012x1878.png 1272w, https://substackcdn.com/image/fetch/$s_!mTIJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1abdd8b8-a375-4ffa-b650-1bcbd49ad3b2_3012x1878.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><h2>Search 4 - Scarlet Tanager</h2><p><strong>Appearance:</strong> This is a <strong>Scarlet Tanager</strong> (<em>Piranga 
olivacea</em>). It has a bright red body, black wings and tail, and a thick, pale beak. It's a small songbird.</p><p>As with search #3, the application successfully identified the Bornean Bristlehead as the most similar bird in the database while clearly differentiating the two. Beyond the shared red and black coloration, these two species exhibit significant differences.</p><p>The Scarlet Tanager is a small bird found in North and South America, while the Bornean Bristlehead is a medium-sized bird native to Borneo. Fortunately, neither species is endangered, and both are widespread within their respective habitats.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!8rAZ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0303d8b-859a-438f-8829-48bfdbf1c97d_3014x1880.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!8rAZ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0303d8b-859a-438f-8829-48bfdbf1c97d_3014x1880.png 424w, https://substackcdn.com/image/fetch/$s_!8rAZ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0303d8b-859a-438f-8829-48bfdbf1c97d_3014x1880.png 848w, https://substackcdn.com/image/fetch/$s_!8rAZ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0303d8b-859a-438f-8829-48bfdbf1c97d_3014x1880.png 1272w, https://substackcdn.com/image/fetch/$s_!8rAZ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0303d8b-859a-438f-8829-48bfdbf1c97d_3014x1880.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!8rAZ!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0303d8b-859a-438f-8829-48bfdbf1c97d_3014x1880.png" width="1200" height="748.3516483516484" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e0303d8b-859a-438f-8829-48bfdbf1c97d_3014x1880.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:908,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:2598983,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://fullstackml.dev/i/159599559?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0303d8b-859a-438f-8829-48bfdbf1c97d_3014x1880.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!8rAZ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0303d8b-859a-438f-8829-48bfdbf1c97d_3014x1880.png 424w, https://substackcdn.com/image/fetch/$s_!8rAZ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0303d8b-859a-438f-8829-48bfdbf1c97d_3014x1880.png 848w, https://substackcdn.com/image/fetch/$s_!8rAZ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0303d8b-859a-438f-8829-48bfdbf1c97d_3014x1880.png 1272w, 
https://substackcdn.com/image/fetch/$s_!8rAZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0303d8b-859a-438f-8829-48bfdbf1c97d_3014x1880.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><h2>API Implementation for the Front End</h2><p>As detailed in the previous article, we employed FastAPI to serve our similarity search system, which comprises the embedding model and LanceDB, our vector database. A notable feature of FastAPI is its built-in support for generating OpenAPI (formerly Swagger) documentation, as illustrated below. 
You have likely encountered this type of API documentation previously.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!1ok8!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0445c53d-28f9-4010-83aa-e1e3ea3aeaf7_1546x1642.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!1ok8!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0445c53d-28f9-4010-83aa-e1e3ea3aeaf7_1546x1642.png 424w, https://substackcdn.com/image/fetch/$s_!1ok8!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0445c53d-28f9-4010-83aa-e1e3ea3aeaf7_1546x1642.png 848w, https://substackcdn.com/image/fetch/$s_!1ok8!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0445c53d-28f9-4010-83aa-e1e3ea3aeaf7_1546x1642.png 1272w, https://substackcdn.com/image/fetch/$s_!1ok8!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0445c53d-28f9-4010-83aa-e1e3ea3aeaf7_1546x1642.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!1ok8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0445c53d-28f9-4010-83aa-e1e3ea3aeaf7_1546x1642.png" width="728" height="773" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0445c53d-28f9-4010-83aa-e1e3ea3aeaf7_1546x1642.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;normal&quot;,&quot;height&quot;:1546,&quot;width&quot;:1456,&quot;resizeWidth&quot;:728,&quot;bytes&quot;:211622,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://fullstackml.dev/i/159599559?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0445c53d-28f9-4010-83aa-e1e3ea3aeaf7_1546x1642.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!1ok8!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0445c53d-28f9-4010-83aa-e1e3ea3aeaf7_1546x1642.png 424w, https://substackcdn.com/image/fetch/$s_!1ok8!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0445c53d-28f9-4010-83aa-e1e3ea3aeaf7_1546x1642.png 848w, https://substackcdn.com/image/fetch/$s_!1ok8!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0445c53d-28f9-4010-83aa-e1e3ea3aeaf7_1546x1642.png 1272w, https://substackcdn.com/image/fetch/$s_!1ok8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0445c53d-28f9-4010-83aa-e1e3ea3aeaf7_1546x1642.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The OpenAPI interface provides an interactive environment for testing the API. Users can upload bird images directly through the documentation to experience the system's functionality firsthand. 
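</p><p>Outside the docs UI, the same endpoint can be called from a script. The route path and form field name below are assumptions (they depend on how the FastAPI app declares its upload endpoint); the sketch builds the multipart request with the Python standard library and stops short of actually sending it.</p>

```python
import mimetypes
import urllib.request
import uuid

API_URL = "http://localhost:8000/search"  # assumed local deployment URL

def build_multipart(field: str, filename: str, payload: bytes):
    """Hand-roll a multipart/form-data body (what the docs UI does for you)."""
    boundary = uuid.uuid4().hex
    ctype = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {ctype}\r\n\r\n"
    ).encode()
    tail = f"\r\n--{boundary}--\r\n".encode()
    return head + payload + tail, f"multipart/form-data; boundary={boundary}"

body, content_type = build_multipart("file", "philippine_eagle.png", b"<image bytes>")
request = urllib.request.Request(
    API_URL, data=body, headers={"Content-Type": content_type}, method="POST"
)
# urllib.request.urlopen(request) would send it; omitted so the sketch stays offline.
```

<p>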
Please try it!</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!nmNR!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac128afa-da29-4fb7-88b3-8579381c920d_1716x1796.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!nmNR!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac128afa-da29-4fb7-88b3-8579381c920d_1716x1796.png 424w, https://substackcdn.com/image/fetch/$s_!nmNR!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac128afa-da29-4fb7-88b3-8579381c920d_1716x1796.png 848w, https://substackcdn.com/image/fetch/$s_!nmNR!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac128afa-da29-4fb7-88b3-8579381c920d_1716x1796.png 1272w, https://substackcdn.com/image/fetch/$s_!nmNR!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac128afa-da29-4fb7-88b3-8579381c920d_1716x1796.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!nmNR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac128afa-da29-4fb7-88b3-8579381c920d_1716x1796.png" width="1456" height="1524" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ac128afa-da29-4fb7-88b3-8579381c920d_1716x1796.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1524,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:370066,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://fullstackml.dev/i/159599559?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac128afa-da29-4fb7-88b3-8579381c920d_1716x1796.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!nmNR!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac128afa-da29-4fb7-88b3-8579381c920d_1716x1796.png 424w, https://substackcdn.com/image/fetch/$s_!nmNR!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac128afa-da29-4fb7-88b3-8579381c920d_1716x1796.png 848w, https://substackcdn.com/image/fetch/$s_!nmNR!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac128afa-da29-4fb7-88b3-8579381c920d_1716x1796.png 1272w, https://substackcdn.com/image/fetch/$s_!nmNR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac128afa-da29-4fb7-88b3-8579381c920d_1716x1796.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>Local System Execution</h2><p>Once the fine-tuned embedding model and the bird embeddings are ingested into LanceDB, you can run the complete ReactJS front-end and FastAPI back-end locally with a single command:</p><pre><code>docker compose up --build</code></pre><p>This builds and starts the Docker Compose network, deploying both the front-end and back-end services, and gives you the complete system described in this article.</p><h2>Parting Words</h2><p>Building this application has been a rewarding experience, initially intended as an educational endeavor for you, my readers and friends. However, as I've often found, these projects invariably benefit me as well.
</p><p>On numerous occasions, I've revisited my past articles, years after publication, and been surprised by the refreshed understanding I gain.  It's a familiar feeling of learning and comprehension, distilled from words and images within a simple blog post.</p><p>I hope this project has sparked your interest in the possibilities of how computers can interpret images.  It would be wonderful if you felt encouraged to explore these concepts further and perhaps even apply them in your own projects. </p><p>My aim was to illustrate the potential of machine learning for visual analysis and similarity search, and I hope this has motivated you to continue learning and experimenting in this exciting field.</p><h2>Why don&#8217;t you give it a go!</h2><p>You can find all the code for <a href="https://github.com/jaeyow/bird-species-embedding-model">this project on GitHub</a>. Feel free to check it out and try it for yourself &#8211; it's all open source!</p><p>If you have any questions, drop a comment here on Substack, or send me <a href="http://linkedin.com/in/joreyes">a DM on LinkedIn</a>.</p><p>And hey, if you're interested in collaborating on something cool, let's chat! 
I'm always up for new ideas and working with other passionate people.</p>]]></content:encoded></item><item><title><![CDATA[#8 - From Training to Deployment: A Simple Approach to Serving Embedding Models]]></title><description><![CDATA[Now that our MLOps pipeline has trained and built our embedding model, it is time to serve the model in a Docker container using FastAPI]]></description><link>https://www.fullstackml.dev/p/8-from-training-to-deployment-a-simple</link><guid isPermaLink="false">https://www.fullstackml.dev/p/8-from-training-to-deployment-a-simple</guid><dc:creator><![CDATA[JO Reyes]]></dc:creator><pubDate>Sat, 22 Mar 2025 00:39:03 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!azqR!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F825fae65-5593-4709-a293-aa043a59850f_1733x879.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>This article is <strong>part of a blog series about demystifying vector embedding models</strong> for an image search use case:</p><p><a href="https://fullstackml.dev/p/4-so-how-do-you-build-an-image-embedding">Part 1. So how do you build a vector embedding model?</a> - Introduces vector embedding models and the intuition behind the technologies we can use to build one ourselves.<br><br><a href="https://fullstackml.dev/p/5-lets-build-our-embedding-model">Part 2. Let's build our image embedding model</a> - Shows a couple of ways to build embedding models - first by using a <strong>pre-trained model</strong>, and next by <strong>fine-tuning</strong> a pre-trained model. We use PyTorch to build our feature extractor.</p><p><a href="https://fullstackml.dev/p/6-using-metaflow-and-mlflow">Part 3. Modelling with Metaflow and MLFlow</a> - Here we use Metaflow to build our model training workflow, introduce the concept of checkpointing, and use MLFlow for experiment tracking.</p><p>Part 4. 
From Training to Deployment: A Simple Approach to Serving Embedding Models - (this article) - Packaging your ML model in a Docker container opens it up to a multitude of model serving options.</p><p><a href="https://fullstackml.dev/p/9-putting-our-bird-embedding-model">Part 5. Putting Our Bird Embedding Model to Work: Introducing the Web Frontend</a> - For our embedding model to prove useful to others, we have created a modern frontend to serve the similarity inference to our users.</p><div><hr></div><p>Hi friends,</p><p>In our last few posts, we talked about embedding models and showed you how to fine-tune a pre-trained <a href="https://en.wikipedia.org/wiki/Residual_neural_network">ResNet model</a> using <a href="https://pytorch.org/">PyTorch</a>. We also went over some MLOps ideas to set up a workflow for training this model consistently.</p><p>Now, we want to take things a step further and actually <em>use</em> this model in a way that's helpful for others, not just me. So, we're going to build a system around it. This means creating an API to let other programs talk to our model, and a nice, modern website for people to interact with it directly.</p><p><strong>Getting Started: The API</strong></p><p><a href="https://fastapi.tiangolo.com/">FastAPI</a> is a really popular tool for building APIs with Python. It's known for being super fast, which makes it a great choice for serving machine learning models. Plus, it has a bunch of other cool stuff built in, like request validation, to make sure the data sent to the API is in the right format.</p><p>It's built on top of <a href="https://docs.pydantic.dev/latest/">Pydantic</a>, which is this incredibly popular library for checking data. Seriously, it gets downloaded like 272 million times every month! That tells you how much developers trust it and find it useful. 
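</p><p>To make that concrete, here is a tiny, hypothetical sketch of the kind of validation Pydantic gives us. The <code>SimilarityRequest</code> model and its field are made-up examples for illustration (assuming Pydantic v2), not the actual request model from this project:</p><pre><code>from pydantic import BaseModel, ValidationError

class SimilarityRequest(BaseModel):
    top_k: int = 5  # how many similar birds to return (hypothetical field)

# well-formed input is coerced and validated
req = SimilarityRequest(top_k="3")
print(req.top_k)  # the string "3" arrives as the int 3

# malformed input is rejected with a structured error
try:
    SimilarityRequest(top_k="many")
except ValidationError as exc:
    print(exc.errors()[0]["type"])</code></pre><p>FastAPI runs this validation for every incoming request automatically, so malformed payloads never reach the model.</p><p>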
Pydantic has tons of features, but the ones I find most handy are: making sure the data is valid, managing settings easily, and turning Python stuff into formats that can be sent over the internet (serialisation).</p><p><strong>Making Things Easy with Docker</strong></p><p><a href="https://www.docker.com/">Docker</a>... well, everyone knows Docker, right? For me, the biggest thing is that it makes life so much easier for me and my team. It gives us the best developer experience (DX) possible. With Docker, we can be pretty confident that if our API runs inside the container on our computers, it's going to run the same way when we put it in the cloud. No more "it works on my machine!" surprises. We also use <a href="https://docs.docker.com/compose/">Docker Compose</a> to not just build the container for the API, but also to set up the whole system around it, like the website that will actually use the API.</p><p><strong>The Power of Vector Databases</strong></p><p>Finally, the real magic behind finding similar things in a project like this is the "semantic search" power of vector databases. In this project, we're using <a href="https://lancedb.github.io/lancedb/examples/examples_python/">LanceDB</a>, which is a vector database that keeps all its data in a file. It's open-source, has a lot of cool features, and the best part? It's super easy to get started with. Remember that feeling the first time you used SQLite? You get the same feeling using LanceDB. 
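</p><p>Under the hood, that "semantic search" magic is just nearest-neighbour ranking over embedding vectors. Here is a minimal sketch of the idea in plain Python; the species names and tiny 3-number vectors are made up for illustration, whereas our real embeddings have 2048 dimensions:</p><pre><code>import math

def cosine_similarity(a, b):
    # higher score means the two embeddings point in a more similar direction
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# toy "embeddings" standing in for what the ResNet feature extractor produces
birds = {
    "kingfisher": [0.9, 0.1, 0.0],
    "kookaburra": [0.8, 0.2, 0.1],
    "emu": [0.0, 0.1, 0.9],
}

query = [0.85, 0.15, 0.05]  # embedding of an uploaded photo
ranked = sorted(birds, key=lambda name: cosine_similarity(query, birds[name]), reverse=True)
print(ranked)  # most similar species first</code></pre><p>A vector database like LanceDB performs this same ranking, just indexed and at scale, which is why we never have to hand-roll it in the API.</p><p>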
</p><p></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!azqR!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F825fae65-5593-4709-a293-aa043a59850f_1733x879.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!azqR!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F825fae65-5593-4709-a293-aa043a59850f_1733x879.png 424w, https://substackcdn.com/image/fetch/$s_!azqR!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F825fae65-5593-4709-a293-aa043a59850f_1733x879.png 848w, https://substackcdn.com/image/fetch/$s_!azqR!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F825fae65-5593-4709-a293-aa043a59850f_1733x879.png 1272w, https://substackcdn.com/image/fetch/$s_!azqR!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F825fae65-5593-4709-a293-aa043a59850f_1733x879.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!azqR!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F825fae65-5593-4709-a293-aa043a59850f_1733x879.png" width="1200" height="609.065934065934" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/825fae65-5593-4709-a293-aa043a59850f_1733x879.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:739,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:138382,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://fullstackml.dev/i/149529739?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F825fae65-5593-4709-a293-aa043a59850f_1733x879.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!azqR!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F825fae65-5593-4709-a293-aa043a59850f_1733x879.png 424w, https://substackcdn.com/image/fetch/$s_!azqR!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F825fae65-5593-4709-a293-aa043a59850f_1733x879.png 848w, https://substackcdn.com/image/fetch/$s_!azqR!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F825fae65-5593-4709-a293-aa043a59850f_1733x879.png 1272w, https://substackcdn.com/image/fetch/$s_!azqR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F825fae65-5593-4709-a293-aa043a59850f_1733x879.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Image: Serving our embedding model using FastAPI, Docker containers, and LanceDB vector database. </figcaption></figure></div><p></p><h2><strong>My Thoughts &amp; Where We're Headed</strong></h2><p>So, our ResNet 50 model sees those bird pictures and creates these "vector embeddings" &#8211; basically, a list of 2048 numbers that represent each image. Now that Generative AI is a big deal, these vector databases are making a comeback, and there are tons of them out there. But for what we're building, we need something solid for production: fast, easy to work with, and packed with features.</p><p>I actually first heard about LanceDB from this blog called <a href="https://thedataquarry.com/">The Data Quarry</a> by Prashanth Rao. 
Specifically, it was in his article &#8211; <a href="https://thedataquarry.com/blog/embedded-db-3/">"Embedded databases (3): LanceDB and the modular data stack."</a> The idea of a vector database that felt like SQLite really clicked with me. Think about SQLite &#8211; it's fast, lightweight, has a lot of cool stuff, and you can use it for your local experiments, but more importantly, you can rely on it in real production systems. That's what got me excited about LanceDB.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!cQzA!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a15197c-fb03-4e5f-b5cc-c0f0b971e610_1038x318.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!cQzA!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a15197c-fb03-4e5f-b5cc-c0f0b971e610_1038x318.png 424w, https://substackcdn.com/image/fetch/$s_!cQzA!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a15197c-fb03-4e5f-b5cc-c0f0b971e610_1038x318.png 848w, https://substackcdn.com/image/fetch/$s_!cQzA!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a15197c-fb03-4e5f-b5cc-c0f0b971e610_1038x318.png 1272w, https://substackcdn.com/image/fetch/$s_!cQzA!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a15197c-fb03-4e5f-b5cc-c0f0b971e610_1038x318.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!cQzA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a15197c-fb03-4e5f-b5cc-c0f0b971e610_1038x318.png" width="696" height="213.22543352601156" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0a15197c-fb03-4e5f-b5cc-c0f0b971e610_1038x318.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;normal&quot;,&quot;height&quot;:318,&quot;width&quot;:1038,&quot;resizeWidth&quot;:696,&quot;bytes&quot;:139904,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://fullstackml.dev/i/149529739?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a15197c-fb03-4e5f-b5cc-c0f0b971e610_1038x318.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!cQzA!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a15197c-fb03-4e5f-b5cc-c0f0b971e610_1038x318.png 424w, https://substackcdn.com/image/fetch/$s_!cQzA!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a15197c-fb03-4e5f-b5cc-c0f0b971e610_1038x318.png 848w, https://substackcdn.com/image/fetch/$s_!cQzA!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a15197c-fb03-4e5f-b5cc-c0f0b971e610_1038x318.png 1272w, 
https://substackcdn.com/image/fetch/$s_!cQzA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a15197c-fb03-4e5f-b5cc-c0f0b971e610_1038x318.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>What really surprised me was seeing how LanceDB stacked up against Elasticsearch as shown in the article. In a lot of ways, LanceDB actually performed better, and it did it using way fewer resources on a single computer. That's pretty impressive!</p><p>Using Docker containers is also a big part of our plan. It's like future-proofing our project for different ways we might want to deploy it, both while we're building it and when it's live. Because we're using Docker, we're not locked into one specific way of putting our API out there. We could use virtual machines, <a href="https://kubernetes.io/">Kubernetes</a> (which is super popular for managing containers), or even serverless options that can run containers, like <a href="https://aws.amazon.com/pm/lambda/">AWS Lambda</a> and <a href="https://azure.microsoft.com/en-au/products/functions/">Azure Functions</a>. Plus, as we talked about earlier, Docker makes life so much easier for developers. We can be pretty sure that if it works in our Docker container, it'll work anywhere else.  </p><h2>Challenges &amp; Unexpected Issues</h2><p>Okay, so this project threw a little curveball my way. It's the first time I've really used <code>conda</code> to manage the Python stuff. I've always been a <code>pyenv</code> + <code>python venv</code> kind of guy. For me, that setup just feels lighter and I don't have to think about it too much anymore &#8211; it's just natural. 
I gotta say, <code>conda</code> and <code>mamba</code> (which is supposed to be a faster version of <code>conda</code>) feel a bit heavier, and it seems like there are a few more commands I need to keep in mind.</p><p>But hey, I'm willing to give it some more time. I've heard from a lot of people that <code>conda</code> is actually better when it comes to making sure all your project's dependencies play nicely together. So, I'll stick with it for a bit and see if it wins me over. And then there&#8217;s <a href="https://github.com/astral-sh/uv">uv</a>, but I will save that for another day. </p><h2>Iterations &amp; Lessons Learned</h2><p>The whole point of starting this series was to get a better handle on how those vector embedding models actually work with images. With everyone talking about using vector databases for semantic search in those RAG (Retrieval-Augmented Generation) applications, I was curious if we could do the same thing with images &#8211; you know, compare them for similarity just like we do with text in NLP. It was pretty cool to find out that ResNet, which is actually a pretty old technology in the world of computer vision, can be trained to recognise objects, and that making an embedding model isn't as complicated as I first thought once you understand the basics.</p><p>Once I had a good grasp on building and training the embedding model, I moved on to creating an MLOps pipeline using <a href="https://metaflow.org/">Metaflow</a> and keeping track of my experiments with <a href="https://mlflow.org/">MLflow</a>. Before I really got into building models myself, I kind of thought that the model building part was the biggest chunk of creating a machine learning system. And don't get me wrong, it's definitely super important. 
But there are so many other things that happen around that modeling part &#8211; before you even start and after you're done &#8211; that when you look at the whole end-to-end system, the actual model building almost looks like a smaller piece of the puzzle.</p><p>After setting up the modeling pipeline, I now have a system that can handle all the steps &#8211; getting the data ready, training the model, and checking how good it is &#8211; in a way that I can repeat whenever I need to retrain in the future. That's pretty neat, but the thing is, nobody else can actually <em>use</em> this yet.</p><p>To make this image similarity thing available to everyone, we need an API. And that's where serving the model using FastAPI and Docker containers comes into play.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!iExQ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae779683-2afb-498e-b1aa-65138a17b380_1024x369.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!iExQ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae779683-2afb-498e-b1aa-65138a17b380_1024x369.png 424w, https://substackcdn.com/image/fetch/$s_!iExQ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae779683-2afb-498e-b1aa-65138a17b380_1024x369.png 848w, https://substackcdn.com/image/fetch/$s_!iExQ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae779683-2afb-498e-b1aa-65138a17b380_1024x369.png 1272w, 
https://substackcdn.com/image/fetch/$s_!iExQ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae779683-2afb-498e-b1aa-65138a17b380_1024x369.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!iExQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae779683-2afb-498e-b1aa-65138a17b380_1024x369.png" width="696" height="250.8046875" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ae779683-2afb-498e-b1aa-65138a17b380_1024x369.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;normal&quot;,&quot;height&quot;:369,&quot;width&quot;:1024,&quot;resizeWidth&quot;:696,&quot;bytes&quot;:19843,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://fullstackml.dev/i/149529739?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae779683-2afb-498e-b1aa-65138a17b380_1024x369.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!iExQ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae779683-2afb-498e-b1aa-65138a17b380_1024x369.png 424w, https://substackcdn.com/image/fetch/$s_!iExQ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae779683-2afb-498e-b1aa-65138a17b380_1024x369.png 848w, 
https://substackcdn.com/image/fetch/$s_!iExQ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae779683-2afb-498e-b1aa-65138a17b380_1024x369.png 1272w, https://substackcdn.com/image/fetch/$s_!iExQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae779683-2afb-498e-b1aa-65138a17b380_1024x369.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>Plus, using Docker not only gives us more options for putting this thing in the cloud, but it also makes the whole development process 
way easier. It helps us developers focus on actually building the stuff without getting bogged down in environment issues.</p><p>In addition to more cloud deployment options, we also get a superior developer experience, which lets us focus even more on the job at hand.</p><h2>Final Results &amp; Reflection</h2><p>Whenever I build anything that uses Docker, I always make sure to use Docker Compose too. Honestly, this is mostly about making the development process smoother than anything else.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!a1LK!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b3549bf-6e1c-45f7-bf56-d9f57c2771ea_1400x672.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!a1LK!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b3549bf-6e1c-45f7-bf56-d9f57c2771ea_1400x672.jpeg 424w, https://substackcdn.com/image/fetch/$s_!a1LK!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b3549bf-6e1c-45f7-bf56-d9f57c2771ea_1400x672.jpeg 848w, https://substackcdn.com/image/fetch/$s_!a1LK!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b3549bf-6e1c-45f7-bf56-d9f57c2771ea_1400x672.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!a1LK!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b3549bf-6e1c-45f7-bf56-d9f57c2771ea_1400x672.jpeg 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!a1LK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b3549bf-6e1c-45f7-bf56-d9f57c2771ea_1400x672.jpeg" width="1400" height="672" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6b3549bf-6e1c-45f7-bf56-d9f57c2771ea_1400x672.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:672,&quot;width&quot;:1400,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:58379,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://fullstackml.dev/i/159516237?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b3549bf-6e1c-45f7-bf56-d9f57c2771ea_1400x672.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!a1LK!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b3549bf-6e1c-45f7-bf56-d9f57c2771ea_1400x672.jpeg 424w, https://substackcdn.com/image/fetch/$s_!a1LK!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b3549bf-6e1c-45f7-bf56-d9f57c2771ea_1400x672.jpeg 848w, https://substackcdn.com/image/fetch/$s_!a1LK!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b3549bf-6e1c-45f7-bf56-d9f57c2771ea_1400x672.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!a1LK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b3549bf-6e1c-45f7-bf56-d9f57c2771ea_1400x672.jpeg 1456w" 
sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>The kinds of systems I usually work on involve one or more Docker containers, and it's just so handy and reliable to have a complete setup that just works on my computer and in the cloud without having to change any code. 
It's a real time-saver and makes things way less frustrating.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!S9ze!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd1057541-b028-4d51-bf29-165c58b99a2f_2970x1292.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!S9ze!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd1057541-b028-4d51-bf29-165c58b99a2f_2970x1292.png 424w, https://substackcdn.com/image/fetch/$s_!S9ze!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd1057541-b028-4d51-bf29-165c58b99a2f_2970x1292.png 848w, https://substackcdn.com/image/fetch/$s_!S9ze!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd1057541-b028-4d51-bf29-165c58b99a2f_2970x1292.png 1272w, https://substackcdn.com/image/fetch/$s_!S9ze!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd1057541-b028-4d51-bf29-165c58b99a2f_2970x1292.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!S9ze!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd1057541-b028-4d51-bf29-165c58b99a2f_2970x1292.png" width="1200" height="521.7032967032967" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d1057541-b028-4d51-bf29-165c58b99a2f_2970x1292.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:633,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:670306,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://fullstackml.dev/i/149529739?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd1057541-b028-4d51-bf29-165c58b99a2f_2970x1292.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!S9ze!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd1057541-b028-4d51-bf29-165c58b99a2f_2970x1292.png 424w, https://substackcdn.com/image/fetch/$s_!S9ze!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd1057541-b028-4d51-bf29-165c58b99a2f_2970x1292.png 848w, https://substackcdn.com/image/fetch/$s_!S9ze!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd1057541-b028-4d51-bf29-165c58b99a2f_2970x1292.png 1272w, https://substackcdn.com/image/fetch/$s_!S9ze!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd1057541-b028-4d51-bf29-165c58b99a2f_2970x1292.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">caption...</figcaption></figure></div><p>For this project, I actually used <a href="https://www.cursor.com/">Cursor</a>, which is an AI-powered code editor, to help me put together the API and the ReactJS website (I'll be talking more about the website in my next post). I've used FastAPI before, and instead of the usual endless Googling and copy-pasting to get things going, Cursor really sped up the initial setup. I definitely had to guide it along the way quite a bit, but it made my work easier. It's like having a coding buddy that knows a lot of the basics.</p><p>And LanceDB? It just works, plain and simple. If you ever need a solid, production-ready vector database that you can embed right into your application, I can now confidently say that LanceDB is the way to go. 
Even though it saves its data to your local hard drive by default, you can easily switch it to use cloud storage like AWS S3 or Azure Blob Storage if you need to. That makes it really flexible.</p><p>Building the API with FastAPI is fantastic, especially when you compare it to how you build APIs using something like AWS Lambda handlers (trust me, FastAPI is much nicer!). The built-in data validation, the easy way it handles data formats, and the fact that it can handle things asynchronously (doing multiple things at once) are all great features.</p><p>But honestly, you don't really see the magic of similarity search with vector databases until you have a user interface to play with. Seeing the search results with your own eyes is just... well, it feels like magic. You type in a description, or in our case, you'll upload a picture, and it instantly finds similar things. It's pretty cool.</p><h2>Why don&#8217;t you give it a go!</h2><p>You can find all the code for <a href="https://github.com/jaeyow/bird-species-embedding-model">this project on GitHub</a>. Feel free to check it out and try it for yourself &#8211; it's all open source!</p><p>If you have any questions, drop a comment here on Substack, or send me <a href="http://linkedin.com/in/joreyes">a DM on LinkedIn</a>.</p><p>And hey, if you're interested in collaborating on something cool, let's chat! 
I'm always up for new ideas and working with other passionate people.</p><p></p>]]></content:encoded></item><item><title><![CDATA[#7 - Streamline Your AI Workflow: A Factory Class for LLMs and Embedding Models ]]></title><description><![CDATA[LLMFactory and EmbeddingModelFactory classes that will improve the developer experience of an AI Engineer]]></description><link>https://www.fullstackml.dev/p/7-streamline-your-ai-workflow-a-factory</link><guid isPermaLink="false">https://www.fullstackml.dev/p/7-streamline-your-ai-workflow-a-factory</guid><dc:creator><![CDATA[JO Reyes]]></dc:creator><pubDate>Sun, 23 Feb 2025 05:01:16 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!cewi!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6805a107-a45a-4602-a37c-ea130220a674_6000x4000.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hi friends,</p><p>In today&#8217;s article, we will be talking about some software engineering best practices that will improve the developer experience as we build Generative AI applications. One of my annoyances in this space is the rapid pace of advancement, with new models appearing in what feels like every week. </p><p>This means I regularly have to update my applications or scripts to accommodate this. 
I know that <a href="https://www.llamaindex.ai/">LlamaIndex</a> and <a href="https://www.langchain.com/langchain">LangChain</a> have a LLMFactory pattern that handles this scenario, however, I haven&#8217;t seen a library that does this for <a href="https://python.useinstructor.com/">Instructor</a>-based projects (which is my preference), until now.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!cewi!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6805a107-a45a-4602-a37c-ea130220a674_6000x4000.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!cewi!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6805a107-a45a-4602-a37c-ea130220a674_6000x4000.jpeg 424w, https://substackcdn.com/image/fetch/$s_!cewi!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6805a107-a45a-4602-a37c-ea130220a674_6000x4000.jpeg 848w, https://substackcdn.com/image/fetch/$s_!cewi!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6805a107-a45a-4602-a37c-ea130220a674_6000x4000.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!cewi!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6805a107-a45a-4602-a37c-ea130220a674_6000x4000.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!cewi!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6805a107-a45a-4602-a37c-ea130220a674_6000x4000.jpeg" width="1456" height="971" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6805a107-a45a-4602-a37c-ea130220a674_6000x4000.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:971,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:4024668,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://fullstackml.dev/i/157717128?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6805a107-a45a-4602-a37c-ea130220a674_6000x4000.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!cewi!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6805a107-a45a-4602-a37c-ea130220a674_6000x4000.jpeg 424w, https://substackcdn.com/image/fetch/$s_!cewi!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6805a107-a45a-4602-a37c-ea130220a674_6000x4000.jpeg 848w, https://substackcdn.com/image/fetch/$s_!cewi!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6805a107-a45a-4602-a37c-ea130220a674_6000x4000.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!cewi!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6805a107-a45a-4602-a37c-ea130220a674_6000x4000.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">A factory pattern for your LLMs and embedding models</figcaption></figure></div><p></p><p>I forked this from <a href="https://github.com/daveebbelaar/pgvectorscale-rag-solution">one of Dave Ebbelaar's repos</a> to play with the code and understand how it works. It is awesome to have a repo filled with best practices and a well-documented codebase. I&#8217;ve been following his work for a few months now, and it&#8217;s still amazing to me how I continuously find diamonds on YouTube; the calibre of learning materials and the generosity are top-notch.</p><p>Here are some learnings from this codebase:</p><ul><li><p>I have always used <strong>Jupyter notebooks</strong> to experiment with code, but this is the first time I have used interactive Python in VSCode, which I thought was a great experience. 
Although I will not be doing this all the time since I still enjoy the usefulness of Jupyter notebooks, I will be using <a href="https://code.visualstudio.com/docs/python/jupyter-support-py">interactive Python in VSCode</a> more often.</p><p></p></li><li><p>I like his use of the <strong>LLMFactory</strong> to make it easy to swap out different models; as I mentioned, this space changes quickly and there&#8217;s always a new SOTA model around the corner. This is a great way to make the codebase more modular and easier to maintain. Also, it&#8217;s simple enough to extend and modify to adapt to your workflow.</p><p></p></li><li><p>It is still using the <strong><a href="https://python.useinstructor.com/">instructor library</a></strong>, which I find indispensable when working with LLMs and structured outputs.</p><p></p></li><li><p>Use <a href="https://docs.pydantic.dev/latest/concepts/pydantic_settings">Pydantic Settings</a> as a powerful way to manage and centralise your application&#8217;s settings; in this case, we load our application settings from an .env file and manage them all from one source file. 
</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!3FxF!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04b54bb8-16cc-4656-b062-9c61a59f5e65_1920x1080.png" data-component-name="Image2ToDOM"><div class="image2-inset image2-full-screen"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!3FxF!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04b54bb8-16cc-4656-b062-9c61a59f5e65_1920x1080.png 424w, https://substackcdn.com/image/fetch/$s_!3FxF!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04b54bb8-16cc-4656-b062-9c61a59f5e65_1920x1080.png 848w, https://substackcdn.com/image/fetch/$s_!3FxF!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04b54bb8-16cc-4656-b062-9c61a59f5e65_1920x1080.png 1272w, https://substackcdn.com/image/fetch/$s_!3FxF!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04b54bb8-16cc-4656-b062-9c61a59f5e65_1920x1080.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!3FxF!,w_5760,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04b54bb8-16cc-4656-b062-9c61a59f5e65_1920x1080.png" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/04b54bb8-16cc-4656-b062-9c61a59f5e65_1920x1080.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;full&quot;,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:329482,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://fullstackml.dev/i/157717128?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04b54bb8-16cc-4656-b062-9c61a59f5e65_1920x1080.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-fullscreen" alt="" srcset="https://substackcdn.com/image/fetch/$s_!3FxF!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04b54bb8-16cc-4656-b062-9c61a59f5e65_1920x1080.png 424w, https://substackcdn.com/image/fetch/$s_!3FxF!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04b54bb8-16cc-4656-b062-9c61a59f5e65_1920x1080.png 848w, https://substackcdn.com/image/fetch/$s_!3FxF!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04b54bb8-16cc-4656-b062-9c61a59f5e65_1920x1080.png 1272w, https://substackcdn.com/image/fetch/$s_!3FxF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04b54bb8-16cc-4656-b062-9c61a59f5e65_1920x1080.png 1456w" sizes="100vw"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">For example, using Anthropic Claude Sonnet 3.5 in AWS Bedrock, with AWS Titan embedding model</figcaption></figure></div><p></p><p>As I was running and playing with the code, there are a few issues that I thought would have been better, so here are the <a href="https://github.com/jaeyow/pgvectorscale-rag-solution">improvements I have added to this fork</a>:</p><ul><li><p>I have <strong>improved LLMFactory</strong> so that the LLM registrations are done separately, meaning you will not need to update the LLMFactory class every time you add a new LLM. 
When a new model comes around, all we have to do is register the new model by adding the <strong>register_llm_client</strong> decorator on the new model-specific code that returns the instructor client.</p><p></p></li><li><p>I have added support for <strong><a href="https://aws.amazon.com/bedrock">Amazon Bedrock</a></strong> to access the <strong><a href="https://www.anthropic.com/claude">Anthropic Claude</a></strong> models there. I also made a few adjustments to the <strong>Bedrock embedding models</strong> so that the calls are now identical to <a href="https://openai.com/">OpenAI</a> or <a href="https://ollama.com/">Ollama</a> embedding models.</p><p></p></li><li><p>I have created a similar factory class for embedding models (<strong>EmbeddingModelFactory</strong>), so that you can now mix and match <strong>different LLMs</strong> with <strong>different embedding models</strong>.</p><p></p></li><li><p>This means I can now use, say, <strong>DeepSeek R1 running on Ollama locally</strong> with the <strong>Titan embedding model</strong> from AWS.</p><p></p></li><li><p>Or <strong>Anthropic Claude Sonnet 3.5</strong> in Amazon Bedrock with an <strong>Ollama local embedding model</strong>.</p><p></p></li><li><p>Or <strong>both the LLM and embedding model from Amazon Bedrock</strong>. 
This is a great way to experiment with different models and not have to change a lot of code.</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!z1jD!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F13b739cb-7f4d-4808-8320-90b4a9c0cecb_1750x1526.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!z1jD!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F13b739cb-7f4d-4808-8320-90b4a9c0cecb_1750x1526.png 424w, https://substackcdn.com/image/fetch/$s_!z1jD!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F13b739cb-7f4d-4808-8320-90b4a9c0cecb_1750x1526.png 848w, https://substackcdn.com/image/fetch/$s_!z1jD!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F13b739cb-7f4d-4808-8320-90b4a9c0cecb_1750x1526.png 1272w, https://substackcdn.com/image/fetch/$s_!z1jD!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F13b739cb-7f4d-4808-8320-90b4a9c0cecb_1750x1526.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!z1jD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F13b739cb-7f4d-4808-8320-90b4a9c0cecb_1750x1526.png" width="1456" height="1270" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/13b739cb-7f4d-4808-8320-90b4a9c0cecb_1750x1526.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1270,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:275741,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://fullstackml.dev/i/157717128?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F13b739cb-7f4d-4808-8320-90b4a9c0cecb_1750x1526.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!z1jD!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F13b739cb-7f4d-4808-8320-90b4a9c0cecb_1750x1526.png 424w, https://substackcdn.com/image/fetch/$s_!z1jD!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F13b739cb-7f4d-4808-8320-90b4a9c0cecb_1750x1526.png 848w, https://substackcdn.com/image/fetch/$s_!z1jD!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F13b739cb-7f4d-4808-8320-90b4a9c0cecb_1750x1526.png 1272w, https://substackcdn.com/image/fetch/$s_!z1jD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F13b739cb-7f4d-4808-8320-90b4a9c0cecb_1750x1526.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>I have added a new decorator <strong>register_llm_client</strong> which is nothing but a lookup table for the registered LLM clients. 
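To make the idea concrete, here is a minimal sketch of how such a decorator-based registry can work. Only the names <strong>register_llm_client</strong> and <strong>LLMFactory</strong> come from the fork; the stub client function and its return value are hypothetical stand-ins (the real registered functions return instructor-wrapped provider clients for OpenAI, Ollama, Bedrock, and so on):

```python
# Minimal sketch of a decorator-based client registry.
# The registry is just a module-level lookup table mapping a provider
# name to the function that builds its client.
_llm_registry = {}

def register_llm_client(name):
    """Store the decorated factory function under `name` in the lookup table."""
    def decorator(func):
        _llm_registry[name] = func
        return func
    return decorator

class LLMFactory:
    @staticmethod
    def create(provider, **kwargs):
        # Look up the registered builder and invoke it.
        if provider not in _llm_registry:
            raise ValueError(f"No LLM client registered for '{provider}'")
        return _llm_registry[provider](**kwargs)

# Supporting a new provider is one decorated function; LLMFactory itself
# never changes. The body here is a stub, not the fork's real code.
@register_llm_client("openai")
def _openai_client(**kwargs):
    # Real code would return something like instructor.from_openai(OpenAI())
    return "openai-client-stub"
```

With this in place, <code>LLMFactory.create("openai")</code> simply looks up and invokes the registered function, and an unknown provider name fails fast with a clear error.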
</p><p></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!OWcf!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9fedbe93-1ed1-4adf-ae06-649e2755e18b_1834x2310.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!OWcf!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9fedbe93-1ed1-4adf-ae06-649e2755e18b_1834x2310.png 424w, https://substackcdn.com/image/fetch/$s_!OWcf!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9fedbe93-1ed1-4adf-ae06-649e2755e18b_1834x2310.png 848w, https://substackcdn.com/image/fetch/$s_!OWcf!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9fedbe93-1ed1-4adf-ae06-649e2755e18b_1834x2310.png 1272w, https://substackcdn.com/image/fetch/$s_!OWcf!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9fedbe93-1ed1-4adf-ae06-649e2755e18b_1834x2310.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!OWcf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9fedbe93-1ed1-4adf-ae06-649e2755e18b_1834x2310.png" width="1456" height="1834" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9fedbe93-1ed1-4adf-ae06-649e2755e18b_1834x2310.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1834,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:529753,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://fullstackml.dev/i/157717128?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9fedbe93-1ed1-4adf-ae06-649e2755e18b_1834x2310.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!OWcf!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9fedbe93-1ed1-4adf-ae06-649e2755e18b_1834x2310.png 424w, https://substackcdn.com/image/fetch/$s_!OWcf!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9fedbe93-1ed1-4adf-ae06-649e2755e18b_1834x2310.png 848w, https://substackcdn.com/image/fetch/$s_!OWcf!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9fedbe93-1ed1-4adf-ae06-649e2755e18b_1834x2310.png 1272w, https://substackcdn.com/image/fetch/$s_!OWcf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9fedbe93-1ed1-4adf-ae06-649e2755e18b_1834x2310.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>To support a new LLM, just decorate the function with <strong>register_llm_client</strong>, and that is all you need to do for the registration.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!u3Sm!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a4ef59f-3fda-4a65-add9-6d56aca4907e_1768x1676.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!u3Sm!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a4ef59f-3fda-4a65-add9-6d56aca4907e_1768x1676.png 424w, 
https://substackcdn.com/image/fetch/$s_!u3Sm!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a4ef59f-3fda-4a65-add9-6d56aca4907e_1768x1676.png 848w, https://substackcdn.com/image/fetch/$s_!u3Sm!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a4ef59f-3fda-4a65-add9-6d56aca4907e_1768x1676.png 1272w, https://substackcdn.com/image/fetch/$s_!u3Sm!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a4ef59f-3fda-4a65-add9-6d56aca4907e_1768x1676.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!u3Sm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a4ef59f-3fda-4a65-add9-6d56aca4907e_1768x1676.png" width="1456" height="1380" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5a4ef59f-3fda-4a65-add9-6d56aca4907e_1768x1676.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1380,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:468503,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://fullstackml.dev/i/157717128?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a4ef59f-3fda-4a65-add9-6d56aca4907e_1768x1676.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!u3Sm!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a4ef59f-3fda-4a65-add9-6d56aca4907e_1768x1676.png 424w, https://substackcdn.com/image/fetch/$s_!u3Sm!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a4ef59f-3fda-4a65-add9-6d56aca4907e_1768x1676.png 848w, https://substackcdn.com/image/fetch/$s_!u3Sm!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a4ef59f-3fda-4a65-add9-6d56aca4907e_1768x1676.png 1272w, https://substackcdn.com/image/fetch/$s_!u3Sm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a4ef59f-3fda-4a65-add9-6d56aca4907e_1768x1676.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" 
stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>In the LLMFactory constructor, simply call <strong>get_llm_client</strong>, and this will return that provider&#8217;s client code!</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!5VTE!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6089e8a-677a-4d13-92a0-b5ec11d075c5_1920x1080.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!5VTE!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6089e8a-677a-4d13-92a0-b5ec11d075c5_1920x1080.png 424w, https://substackcdn.com/image/fetch/$s_!5VTE!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6089e8a-677a-4d13-92a0-b5ec11d075c5_1920x1080.png 848w, https://substackcdn.com/image/fetch/$s_!5VTE!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6089e8a-677a-4d13-92a0-b5ec11d075c5_1920x1080.png 1272w, https://substackcdn.com/image/fetch/$s_!5VTE!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6089e8a-677a-4d13-92a0-b5ec11d075c5_1920x1080.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!5VTE!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6089e8a-677a-4d13-92a0-b5ec11d075c5_1920x1080.png" width="1200" height="675" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a6089e8a-677a-4d13-92a0-b5ec11d075c5_1920x1080.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:323644,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://fullstackml.dev/i/157717128?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6089e8a-677a-4d13-92a0-b5ec11d075c5_1920x1080.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!5VTE!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6089e8a-677a-4d13-92a0-b5ec11d075c5_1920x1080.png 424w, https://substackcdn.com/image/fetch/$s_!5VTE!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6089e8a-677a-4d13-92a0-b5ec11d075c5_1920x1080.png 848w, https://substackcdn.com/image/fetch/$s_!5VTE!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6089e8a-677a-4d13-92a0-b5ec11d075c5_1920x1080.png 1272w, https://substackcdn.com/image/fetch/$s_!5VTE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6089e8a-677a-4d13-92a0-b5ec11d075c5_1920x1080.png 
1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Next example: using the DeepSeek R1 LLM with the AWS Titan embedding model</figcaption></figure></div><p>If you want to explore LLMFactory further, have a look at Dave&#8217;s repo at <a href="https://github.com/daveebbelaar/pgvectorscale-rag-solution">https://github.com/daveebbelaar/pgvectorscale-rag-solution</a> or my fork at <a href="https://github.com/jaeyow/pgvectorscale-rag-solution">https://github.com/jaeyow/pgvectorscale-rag-solution</a>.</p><p>Till 
then,<br><br>JO</p>]]></content:encoded></item><item><title><![CDATA[#6 - Modelling with Metaflow and MLFlow]]></title><description><![CDATA[After building a model in a Jupyter Notebook, we port our code to be production-ready and we use Metaflow for orchestration and MLFlow for experiment tracking]]></description><link>https://www.fullstackml.dev/p/6-using-metaflow-and-mlflow</link><guid isPermaLink="false">https://www.fullstackml.dev/p/6-using-metaflow-and-mlflow</guid><dc:creator><![CDATA[JO Reyes]]></dc:creator><pubDate>Mon, 02 Dec 2024 10:00:58 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!tf7l!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd782094-9e66-4b65-a69b-7584a7aec62d_1234x1137.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>This article is <strong>part of a blog series about demystifying vector embedding models</strong> for use in image embeddings:</p><p><a href="https://fullstackml.dev/p/4-so-how-do-you-build-an-image-embedding">Part 1. So how do you build a vector embedding model?</a> - Introduces vector embedding models and the intuition behind the technologies we can use to build one ourselves.<br><br><a href="https://fullstackml.dev/p/5-lets-build-our-embedding-model">Part 2. Let's build our image embedding model</a> - Shows a couple of ways to build embedding models - first by using a <strong>pre-trained model</strong>, and next by <strong>fine-tuning</strong> a pre-trained model. We use PyTorch to build our feature extractor.</p><p><a href="https://fullstackml.dev/p/6-using-metaflow-and-mlflow">Part 3. Modelling with Metaflow and MLFlow</a> - (this article) - Here we use Metaflow to build our model training workflow, where we introduce the concept of checkpointing, and MLFlow for experiment tracking.</p><p><a href="https://fullstackml.dev/p/8-from-training-to-deployment-a-simple">Part 4. 
From Training to Deployment: A Simple Approach to Serving Embedding Models</a> - Packaging your ML model in a Docker container opens it up to a multitude of model serving options.</p><p><a href="https://fullstackml.dev/p/9-putting-our-bird-embedding-model">Part 5. Putting Our Bird Embedding Model to Work: Introducing the Web Frontend</a> - For our embedding model to prove useful to others, we have created a modern frontend to serve the similarity inference to our users.</p><div><hr></div><p>Hi friends, </p><p><a href="https://fullstackml.dev/p/5-lets-build-our-embedding-model">In the previous article</a>, we explored two approaches to creating an image embedding model, demonstrated through Jupyter notebooks: <a href="https://github.com/jaeyow/bird-species-embedding-model/blob/main/notebooks/01-using-a-pre-trained-model.ipynb">one for using a pre-trained model</a> and <a href="https://github.com/jaeyow/bird-species-embedding-model/blob/main/notebooks/02-fine-tuning-a-pre-trained-model.ipynb">another for fine-tuning that pre-trained model</a>. 
These examples showed how to build an embedding model both by leveraging an existing pre-trained model and by fine-tuning it for our specific use case.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!tf7l!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd782094-9e66-4b65-a69b-7584a7aec62d_1234x1137.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!tf7l!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd782094-9e66-4b65-a69b-7584a7aec62d_1234x1137.png 424w, https://substackcdn.com/image/fetch/$s_!tf7l!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd782094-9e66-4b65-a69b-7584a7aec62d_1234x1137.png 848w, https://substackcdn.com/image/fetch/$s_!tf7l!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd782094-9e66-4b65-a69b-7584a7aec62d_1234x1137.png 1272w, https://substackcdn.com/image/fetch/$s_!tf7l!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd782094-9e66-4b65-a69b-7584a7aec62d_1234x1137.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!tf7l!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd782094-9e66-4b65-a69b-7584a7aec62d_1234x1137.png" width="1234" height="1137" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/dd782094-9e66-4b65-a69b-7584a7aec62d_1234x1137.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1137,&quot;width&quot;:1234,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1176403,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!tf7l!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd782094-9e66-4b65-a69b-7584a7aec62d_1234x1137.png 424w, https://substackcdn.com/image/fetch/$s_!tf7l!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd782094-9e66-4b65-a69b-7584a7aec62d_1234x1137.png 848w, https://substackcdn.com/image/fetch/$s_!tf7l!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd782094-9e66-4b65-a69b-7584a7aec62d_1234x1137.png 1272w, https://substackcdn.com/image/fetch/$s_!tf7l!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd782094-9e66-4b65-a69b-7584a7aec62d_1234x1137.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><em>Image: We built an embedding model able to distinguish between bird species</em>. </p><p>These notebooks are a good introduction, but they are still a long way from building machine learning models for real: models that can be easily trained and prepared for online inference in the cloud, or in other words, models that are as close to production-ready as possible.</p><p><strong>Introducing Metaflow</strong></p><p>In this article, we&#8217;ll introduce Metaflow, a fantastic machine learning orchestrator <a href="https://netflixtechblog.com/open-sourcing-metaflow-a-human-centric-framework-for-data-science-fa72e04a5d9">open-sourced in 2019</a> after being developed and battle-tested at Netflix to run their internal ML, AI and data projects. 
I won&#8217;t repeat a full introduction to Metaflow here; <a href="https://docs.metaflow.org/metaflow/introduction">their excellent documentation</a> already does it better than I ever could.</p><p>According to their website: &#8220;Metaflow is a human-friendly Python library that makes it straightforward to develop, deploy, and operate various kinds of data-intensive applications, in particular those involving data science, ML, and AI.&#8221;</p><p>I have been using Metaflow for a while now; <a href="https://cevo.com.au/post/build-train-deploy-your-own-recommender-system/#elementor-toc__heading-anchor-13">here&#8217;s a past project</a> that used it as the ML orchestrator. That small project let me experience first-hand the power of this tool.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!4_dd!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F25d3c8b1-8981-48a0-86e8-b0e012e802ed_1434x808.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!4_dd!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F25d3c8b1-8981-48a0-86e8-b0e012e802ed_1434x808.png 424w, https://substackcdn.com/image/fetch/$s_!4_dd!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F25d3c8b1-8981-48a0-86e8-b0e012e802ed_1434x808.png 848w, https://substackcdn.com/image/fetch/$s_!4_dd!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F25d3c8b1-8981-48a0-86e8-b0e012e802ed_1434x808.png 1272w, 
https://substackcdn.com/image/fetch/$s_!4_dd!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F25d3c8b1-8981-48a0-86e8-b0e012e802ed_1434x808.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!4_dd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F25d3c8b1-8981-48a0-86e8-b0e012e802ed_1434x808.png" width="1434" height="808" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/25d3c8b1-8981-48a0-86e8-b0e012e802ed_1434x808.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:808,&quot;width&quot;:1434,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:394721,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!4_dd!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F25d3c8b1-8981-48a0-86e8-b0e012e802ed_1434x808.png 424w, https://substackcdn.com/image/fetch/$s_!4_dd!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F25d3c8b1-8981-48a0-86e8-b0e012e802ed_1434x808.png 848w, https://substackcdn.com/image/fetch/$s_!4_dd!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F25d3c8b1-8981-48a0-86e8-b0e012e802ed_1434x808.png 1272w, 
https://substackcdn.com/image/fetch/$s_!4_dd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F25d3c8b1-8981-48a0-86e8-b0e012e802ed_1434x808.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p><em>Image: The stacks in Metaflow.</em> </p><p><strong>Metaflow superpowers</strong></p><p>The real strength of this tool lies in its ability to streamline the transition from early prototyping to production. You can start by developing your solution locally on a laptop, where the experience is intuitive and straightforward. 
Then, with minimal code changes, you can scale your code to leverage your preferred cloud platform&#8212;whether AWS, Google Cloud, or Azure&#8212;for compute and storage.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!mSj1!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0494559-d289-43c1-9c26-33b6422bfe42_1632x814.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!mSj1!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0494559-d289-43c1-9c26-33b6422bfe42_1632x814.png 424w, https://substackcdn.com/image/fetch/$s_!mSj1!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0494559-d289-43c1-9c26-33b6422bfe42_1632x814.png 848w, https://substackcdn.com/image/fetch/$s_!mSj1!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0494559-d289-43c1-9c26-33b6422bfe42_1632x814.png 1272w, https://substackcdn.com/image/fetch/$s_!mSj1!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0494559-d289-43c1-9c26-33b6422bfe42_1632x814.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!mSj1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0494559-d289-43c1-9c26-33b6422bfe42_1632x814.png" width="1456" height="726" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d0494559-d289-43c1-9c26-33b6422bfe42_1632x814.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:726,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:154785,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!mSj1!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0494559-d289-43c1-9c26-33b6422bfe42_1632x814.png 424w, https://substackcdn.com/image/fetch/$s_!mSj1!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0494559-d289-43c1-9c26-33b6422bfe42_1632x814.png 848w, https://substackcdn.com/image/fetch/$s_!mSj1!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0494559-d289-43c1-9c26-33b6422bfe42_1632x814.png 1272w, https://substackcdn.com/image/fetch/$s_!mSj1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0494559-d289-43c1-9c26-33b6422bfe42_1632x814.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><em>Image: Metaflow enables a simpler process from developing locally to full production system.</em> </p><div><hr></div><p>Here&#8217;s <a href="https://github.com/jaeyow/bird-species-embedding-model/blob/main/flows/01-resnet-fine-tuning-flow.py">the Metaflow pipeline I created for training our bird species embedding model</a> (Github link).</p><div><hr></div><p>This is truly ingenious because it empowers data scientists to productionise their models independently. Traditionally, this responsibility often fell to Machine Learning Engineers (MLEs), who had to port and adapt the models for production. While effective, this approach introduces an additional layer of complexity, which can disrupt the process.</p><p>Data scientists, who understand their models best, should be the ones bringing their creations to life. This tool eliminates that barrier, enabling them to do so with ease. 
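</p><p>If you haven&#8217;t seen a Metaflow flow before, it is just a Python class: in the real API you subclass metaflow.FlowSpec, mark methods with @step, and chain them with self.next(...), while artifacts assigned to self are versioned automatically. As a dependency-free sketch of that shape (the class name, step names and numbers below are illustrative placeholders, not the actual flow in the linked repo):</p>

```python
# Dependency-free sketch of the *shape* of a Metaflow training flow.
# A real flow subclasses metaflow.FlowSpec, decorates methods with @step,
# and moves between steps with self.next(self.some_step).
# Everything below is illustrative only.

class BirdEmbeddingFlowSketch:
    def start(self):
        self.epochs = 2            # a hyperparameter, carried as an artifact
        return self.load_data      # stand-in for self.next(self.load_data)

    def load_data(self):
        self.train_samples = 100   # placeholder for the real dataset split
        return self.train

    def train(self):
        # the real step runs the PyTorch fine-tuning loop, logging metrics
        # to MLFlow and writing a checkpoint after each epoch
        self.metrics = {"epochs_run": self.epochs,
                        "samples": self.train_samples}
        return self.end

    def end(self):
        return None                # terminal step: no successor

    def run(self):
        step = self.start
        while step is not None:    # walk the linear DAG start -> ... -> end
            step = step()
        return self.metrics
```

<p>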
Ultimately, it allows you to achieve more with fewer resources.</p><p>Since the day it was open-sourced, more and more materials have become available, such as these <a href="https://docs.metaflow.org/getting-started/tutorials">great tutorial resources</a>. The team has made it even easier by providing <a href="https://docs.outerbounds.com/sandbox/">Metaflow in a sandbox environment</a>, so you don&#8217;t even have to set it up in your own cloud account.</p><p><strong>Using Metaflow to build our model</strong></p><p>So, to continue the series and the bird classification use case, I&#8217;ve built an MLOps pipeline that is almost ready for production. I say almost because I&#8217;m not actually deploying it to any cloud environment, although doing so would take minimal effort. Here is the Python code for this flow.</p><p>With Python, Metaflow and MLFlow, we have built an ML system that:</p><ul><li><p>lets us define pipelines as Python-based Directed Acyclic Graphs (DAGs)</p></li><li><p>has version control as a native feature</p></li><li><p>makes it easy to scale a pipeline from prototype to production</p></li><li><p>is library agnostic, so you can bring in your favourite ML library, like scikit-learn, <a href="https://pytorch.org/">PyTorch</a> or <a href="https://www.tensorflow.org/">TensorFlow</a></p></li><li><p>makes pipelines easier to debug and more fault-tolerant</p></li><li><p>with <a href="https://mlflow.org/">MLFlow</a>, logs and tracks experiments so the system is dependable and reproducible, including a model registry that makes tracking your models easy
</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!uSy7!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28d5cb56-38ff-4c2a-a263-ba4b65bcfcea_3014x1628.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!uSy7!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28d5cb56-38ff-4c2a-a263-ba4b65bcfcea_3014x1628.png 424w, https://substackcdn.com/image/fetch/$s_!uSy7!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28d5cb56-38ff-4c2a-a263-ba4b65bcfcea_3014x1628.png 848w, https://substackcdn.com/image/fetch/$s_!uSy7!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28d5cb56-38ff-4c2a-a263-ba4b65bcfcea_3014x1628.png 1272w, https://substackcdn.com/image/fetch/$s_!uSy7!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28d5cb56-38ff-4c2a-a263-ba4b65bcfcea_3014x1628.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!uSy7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28d5cb56-38ff-4c2a-a263-ba4b65bcfcea_3014x1628.png" width="1456" height="786" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/28d5cb56-38ff-4c2a-a263-ba4b65bcfcea_3014x1628.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:786,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:340677,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!uSy7!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28d5cb56-38ff-4c2a-a263-ba4b65bcfcea_3014x1628.png 424w, https://substackcdn.com/image/fetch/$s_!uSy7!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28d5cb56-38ff-4c2a-a263-ba4b65bcfcea_3014x1628.png 848w, https://substackcdn.com/image/fetch/$s_!uSy7!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28d5cb56-38ff-4c2a-a263-ba4b65bcfcea_3014x1628.png 1272w, https://substackcdn.com/image/fetch/$s_!uSy7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28d5cb56-38ff-4c2a-a263-ba4b65bcfcea_3014x1628.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p><em>Image: MLFlow metrics page showing this training run&#8217;s metrics</em></p><p><strong>Model Checkpoints</strong></p><p>One of the key features I&#8217;ve added to this workflow is support for <strong>model checkpoints</strong>. Before diving deeper, let&#8217;s clarify what checkpoints are. As described in <a href="https://towardsdatascience.com/ml-design-pattern-2-checkpoints-e6ca25a4c5fe">an article on </a><em><a href="https://towardsdatascience.com/ml-design-pattern-2-checkpoints-e6ca25a4c5fe">Towards Data Science</a></em>, a checkpoint is a snapshot of a model&#8217;s entire internal state at a given point during training. At the end of training, this state is saved into a binary file, which can later be loaded to enable predictions.</p><p>However, best practices suggest going beyond saving the model only at the end of training.
It&#8217;s far more effective to periodically save checkpoints at critical stages of the training cycle.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!YabZ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F79f4dbef-e442-40a0-9ffb-d64f05c51947_948x344.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!YabZ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F79f4dbef-e442-40a0-9ffb-d64f05c51947_948x344.png 424w, https://substackcdn.com/image/fetch/$s_!YabZ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F79f4dbef-e442-40a0-9ffb-d64f05c51947_948x344.png 848w, https://substackcdn.com/image/fetch/$s_!YabZ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F79f4dbef-e442-40a0-9ffb-d64f05c51947_948x344.png 1272w, https://substackcdn.com/image/fetch/$s_!YabZ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F79f4dbef-e442-40a0-9ffb-d64f05c51947_948x344.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!YabZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F79f4dbef-e442-40a0-9ffb-d64f05c51947_948x344.png" width="948" height="344" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/79f4dbef-e442-40a0-9ffb-d64f05c51947_948x344.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:344,&quot;width&quot;:948,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:87931,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!YabZ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F79f4dbef-e442-40a0-9ffb-d64f05c51947_948x344.png 424w, https://substackcdn.com/image/fetch/$s_!YabZ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F79f4dbef-e442-40a0-9ffb-d64f05c51947_948x344.png 848w, https://substackcdn.com/image/fetch/$s_!YabZ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F79f4dbef-e442-40a0-9ffb-d64f05c51947_948x344.png 1272w, https://substackcdn.com/image/fetch/$s_!YabZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F79f4dbef-e442-40a0-9ffb-d64f05c51947_948x344.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p><em>Image: Periodic checkpointing (image by Lak Lakshmanan)</em></p><p>Most people think of checkpointing simply as a way to save a trained model for later inference, and while this is important, the benefits of checkpointing extend much further. One of its most significant advantages is enabling a fault-tolerant training system&#8212;a feature I cannot stress enough.</p><p>Imagine you&#8217;re running a <strong>10-hour training cycle</strong>, and disaster strikes&#8212;be it hardware failure, a software crash, or some other catastrophic issue&#8212;9 hours in. Without checkpoints, you&#8217;d need to start training from scratch, wasting time and resources on another 10-hour cycle. However, with periodic checkpoints, you can simply resume training from the most recent saved state.
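</p><p>The resume-from-latest-checkpoint logic described above can be sketched in plain Python (illustrative only: the file names, JSON format and dummy training step are my own stand-ins; a real PyTorch loop would persist state with torch.save instead):</p>

```python
import json
from pathlib import Path

def save_checkpoint(ckpt_dir: Path, step: int, state: dict) -> None:
    # Persist the full training state so a crashed run can pick up from here.
    path = ckpt_dir / f"ckpt_{step:06d}.json"
    path.write_text(json.dumps({"step": step, "state": state}))

def latest_checkpoint(ckpt_dir: Path):
    # Zero-padded names sort lexicographically, so the last one is the newest.
    ckpts = sorted(ckpt_dir.glob("ckpt_*.json"))
    return json.loads(ckpts[-1].read_text()) if ckpts else None

def train(ckpt_dir: Path, total_steps: int = 10, save_every: int = 3) -> int:
    # Resume from the most recent snapshot rather than restarting from step 0.
    ckpt = latest_checkpoint(ckpt_dir)
    start = ckpt["step"] + 1 if ckpt else 0
    loss = ckpt["state"]["loss"] if ckpt else 100.0
    for step in range(start, total_steps):
        loss *= 0.9  # stand-in for one real training step
        if (step + 1) % save_every == 0:
            save_checkpoint(ckpt_dir, step, {"loss": loss})
    return start  # the step the run actually started from
```

<p>Calling train a second time against the same checkpoint directory starts from the last saved step rather than from zero, which is exactly the behaviour that rescues a long run after a crash.</p><p>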
For instance, if you save your model every hour, you&#8217;d only need to retrain from hour 9, saving precious time and effort.</p><p>That said, it&#8217;s important to strike a balance in how frequently you save checkpoints. Saving too often can waste storage and slow down training, while saving too infrequently may not provide enough recovery points to be useful. Thoughtful checkpointing ensures you maximise efficiency and minimise risk.</p><p><strong>Conclusion</strong></p><div><hr></div><p>Here&#8217;s <a href="https://github.com/jaeyow/bird-species-embedding-model/blob/main/flows/01-resnet-fine-tuning-flow.py">the Metaflow pipeline I created for training our bird species embedding model</a> (GitHub link).</p><div><hr></div><p>As discussed in previous articles, we approached this by initially treating the task as a classification problem. If you examine the code, you'll notice it follows a standard PyTorch model training process. However, the key step is removing the classification head before saving the final checkpoint. This transforms the model from a classifier into an embedding model capable of generating a 2048-dimensional embedding vector for each bird image.</p><p>We ported our model training notebooks into a production-ready workflow using Metaflow DAGs. Metaflow works out of the box for local development; MLFlow, however, needed a little extra configuration to run in a local Docker setup.</p><p>This wraps up our process for building embedding models. It was an excellent opportunity to improve our understanding of deep learning for image processing.
While ResNet and similar architectures are considered "old tech" (dating back to 2015), they remain foundational and widely used in industries like manufacturing and medical diagnostics.<br></p><p>Till then, <br><br>JO</p>]]></content:encoded></item><item><title><![CDATA[#5 - Let's build our image embedding model]]></title><description><![CDATA[Building an embedding model is not hard, and in this article we'll have a look at a couple of options]]></description><link>https://www.fullstackml.dev/p/5-lets-build-our-embedding-model</link><guid isPermaLink="false">https://www.fullstackml.dev/p/5-lets-build-our-embedding-model</guid><dc:creator><![CDATA[JO Reyes]]></dc:creator><pubDate>Sun, 06 Oct 2024 22:01:20 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F96a56461-7992-40c4-b70c-28570243af36_1225x1123.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>This article is <strong>part of a blog series about demystifying vector embedding models</strong> for use in image embeddings:</p><p><a href="https://fullstackml.dev/p/4-so-how-do-you-build-an-image-embedding">Part 1. So how do you build a vector embedding model?</a> - Introduces vector embedding models and the intuition behind the technologies we can use to build one ourselves.<br><br><a href="https://fullstackml.dev/p/5-lets-build-our-embedding-model">Part 2. Let's build our image embedding model</a> - (this article) - Shows a couple of ways to build embedding models - first by using a <strong>pre-trained model</strong>, and next by <strong>fine-tuning</strong> a pre-trained model. We use PyTorch to build our feature extractor.</p><p><a href="https://fullstackml.dev/p/6-using-metaflow-and-mlflow">Part 3.
Modelling with Metaflow and MLFlow</a> - Here we are using Metaflow to build our model training workflow, where we introduce the concept of checkpointing, and MLFlow for experiment tracking.</p><p><a href="https://fullstackml.dev/p/8-from-training-to-deployment-a-simple">Part 4. From Training to Deployment: A Simple Approach to Serving Embedding Models</a> - Packaging your ML model in a Docker container opens it up to a multitude of model serving options.</p><p><a href="https://fullstackml.dev/p/9-putting-our-bird-embedding-model">Part 5. Putting Our Bird Embedding Model to Work: Introducing the Web Frontend</a> - For our embedding model to prove useful to others, we have created a modern frontend to serve the similarity inference to our users.</p><div><hr></div><p>Hi friends, </p><p><a href="https://fullstackml.dev/p/4-so-how-do-you-build-an-image-embedding">In the last article</a>, we were tasked by <strong>Bird Watch</strong> to build their bird species similarity search engine. We have a dataset of <strong>525 bird species</strong> downloaded from Kaggle. I swear I got it originally from Kaggle; however, when preparing this article, I couldn&#8217;t find it there anymore. I did find a copy on Hugging Face, so <a href="http://yashikota/birds-525-species-image-classification">here it is</a>.</p><p>A similarity search engine operates by extracting embedding vectors from images and calculating the distances between them. The magnitude of these distances indicates the degree of similarity between the images. To create an effective embedding model, it's essential to select a suitable image model and decide whether to use the <strong>pre-trained model</strong> as-is or to <strong>fine-tune it</strong> for your specific application.
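</p><p>The distance computation at the heart of the search can be sketched with cosine similarity in plain Python (the three-dimensional vectors and species names below are toy stand-ins for real 2048-dimensional embeddings):</p>

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    # 1.0 means the embeddings point the same way; values near 0 mean unrelated.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def most_similar(query: list[float], gallery: dict[str, list[float]]) -> str:
    # Rank every gallery embedding against the query; return the best match.
    return max(gallery, key=lambda name: cosine_similarity(query, gallery[name]))
```

<p>A production search engine performs the same ranking, just over many thousands of vectors, usually via a vector index rather than a brute-force loop.</p><p>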
Using either approach leverages the model&#8217;s feature extraction capabilities.</p><p>Let&#8217;s begin.</p><h2>Option 1: Use a Pre-trained model</h2><p>The breakthrough performance of <a href="https://en.wikipedia.org/wiki/AlexNet">AlexNet</a> in 2012 opened the floodgates for deep learning in computer vision, leading to the discovery and development of numerous models over the years. These models not only improved accuracy but also became increasingly efficient. Notable examples include <a href="https://towardsdatascience.com/vgg-neural-networks-the-next-step-after-alexnet-3f91fa9ffe2c">VGGNet</a> (2014), <a href="https://en.wikipedia.org/wiki/Residual_neural_network">ResNet</a> (2015), <a href="https://arxiv.org/abs/1704.04861">MobileNet</a> (2017), and <a href="https://arxiv.org/abs/2010.11929">Vision Transformers (ViT)</a> (2020), all of which have pushed the boundaries of what's possible in image processing with deep learning.</p><p>For the purposes of this example, we will focus on a model from the ResNet family&#8212;specifically, <strong>ResNet-50</strong>.
In my experience, <a href="https://huggingface.co/microsoft/resnet-50">ResNet-50</a> strikes an ideal balance between accuracy and computational efficiency, making it well-suited for this particular use case.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!2Z26!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a7bd942-e97a-4e6f-9e30-88b3ee404e13_1000x313.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!2Z26!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a7bd942-e97a-4e6f-9e30-88b3ee404e13_1000x313.jpeg 424w, https://substackcdn.com/image/fetch/$s_!2Z26!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a7bd942-e97a-4e6f-9e30-88b3ee404e13_1000x313.jpeg 848w, https://substackcdn.com/image/fetch/$s_!2Z26!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a7bd942-e97a-4e6f-9e30-88b3ee404e13_1000x313.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!2Z26!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a7bd942-e97a-4e6f-9e30-88b3ee404e13_1000x313.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!2Z26!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a7bd942-e97a-4e6f-9e30-88b3ee404e13_1000x313.jpeg" width="1000" height="313" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3a7bd942-e97a-4e6f-9e30-88b3ee404e13_1000x313.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:313,&quot;width&quot;:1000,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:40114,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!2Z26!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a7bd942-e97a-4e6f-9e30-88b3ee404e13_1000x313.jpeg 424w, https://substackcdn.com/image/fetch/$s_!2Z26!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a7bd942-e97a-4e6f-9e30-88b3ee404e13_1000x313.jpeg 848w, https://substackcdn.com/image/fetch/$s_!2Z26!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a7bd942-e97a-4e6f-9e30-88b3ee404e13_1000x313.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!2Z26!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a7bd942-e97a-4e6f-9e30-88b3ee404e13_1000x313.jpeg 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>ResNet General Architectural Design</p><h3>Follow with the Jupyter Notebook</h3><p><a href="https://github.com/jaeyow/bird-species-embedding-model/blob/main/notebooks/01-using-a-pre-trained-model.ipynb">Follow the scripts in a Jupyter notebook here</a>.</p><p>Using a pre-trained model as an embedding model is the simplest and most efficient option. For instance, in our case, ResNet-50 has been pre-trained on the ImageNet dataset and includes a classifier head with 1000 classes. To create an embedding model, you only need to <strong>remove the final classification layer</strong>, transforming it into a feature extractor that's ready to use for our bird species similarity search engine.</p><h3>Similarity Search with pre-trained ResNet-50</h3><p>In the following images, we used the pre-trained model as an embedding model. While it works to some extent, its performance isn't optimal.
For example, it performed well for the <a href="https://en.wikipedia.org/wiki/Philippine_eagle">Philippine Eagle</a> and <a href="https://en.wikipedia.org/wiki/Great_tit">Parus Major</a> but struggled with the <a href="https://en.wikipedia.org/wiki/Scarlet_tanager">Scarlet Tanager</a>. Additionally, if you examine their similarity scores, you'll notice that the bird species are clustered too closely together. This is expected, as the ResNet-50 model was pre-trained on a 1,000-class dataset, including many categories beyond birds, which affects its ability to distinguish between different bird species in fine detail.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!eqNU!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F96a56461-7992-40c4-b70c-28570243af36_1225x1123.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!eqNU!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F96a56461-7992-40c4-b70c-28570243af36_1225x1123.png 424w, https://substackcdn.com/image/fetch/$s_!eqNU!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F96a56461-7992-40c4-b70c-28570243af36_1225x1123.png 848w, https://substackcdn.com/image/fetch/$s_!eqNU!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F96a56461-7992-40c4-b70c-28570243af36_1225x1123.png 1272w, https://substackcdn.com/image/fetch/$s_!eqNU!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F96a56461-7992-40c4-b70c-28570243af36_1225x1123.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!eqNU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F96a56461-7992-40c4-b70c-28570243af36_1225x1123.png" width="1225" height="1123" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/96a56461-7992-40c4-b70c-28570243af36_1225x1123.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1123,&quot;width&quot;:1225,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1248999,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!eqNU!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F96a56461-7992-40c4-b70c-28570243af36_1225x1123.png 424w, https://substackcdn.com/image/fetch/$s_!eqNU!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F96a56461-7992-40c4-b70c-28570243af36_1225x1123.png 848w, https://substackcdn.com/image/fetch/$s_!eqNU!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F96a56461-7992-40c4-b70c-28570243af36_1225x1123.png 1272w, https://substackcdn.com/image/fetch/$s_!eqNU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F96a56461-7992-40c4-b70c-28570243af36_1225x1123.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!DdvD!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F25bf5ee7-32e2-43bb-9368-4f2b400561b1_1243x1140.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!DdvD!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F25bf5ee7-32e2-43bb-9368-4f2b400561b1_1243x1140.png 424w,
https://substackcdn.com/image/fetch/$s_!DdvD!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F25bf5ee7-32e2-43bb-9368-4f2b400561b1_1243x1140.png 848w, https://substackcdn.com/image/fetch/$s_!DdvD!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F25bf5ee7-32e2-43bb-9368-4f2b400561b1_1243x1140.png 1272w, https://substackcdn.com/image/fetch/$s_!DdvD!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F25bf5ee7-32e2-43bb-9368-4f2b400561b1_1243x1140.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!DdvD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F25bf5ee7-32e2-43bb-9368-4f2b400561b1_1243x1140.png" width="1243" height="1140" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/25bf5ee7-32e2-43bb-9368-4f2b400561b1_1243x1140.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1140,&quot;width&quot;:1243,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1193566,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!DdvD!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F25bf5ee7-32e2-43bb-9368-4f2b400561b1_1243x1140.png 424w, 
https://substackcdn.com/image/fetch/$s_!DdvD!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F25bf5ee7-32e2-43bb-9368-4f2b400561b1_1243x1140.png 848w, https://substackcdn.com/image/fetch/$s_!DdvD!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F25bf5ee7-32e2-43bb-9368-4f2b400561b1_1243x1140.png 1272w, https://substackcdn.com/image/fetch/$s_!DdvD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F25bf5ee7-32e2-43bb-9368-4f2b400561b1_1243x1140.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" 
y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!3krr!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78dab463-1fc0-4790-abb6-0f825e7d7fa1_1250x1157.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!3krr!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78dab463-1fc0-4790-abb6-0f825e7d7fa1_1250x1157.png 424w, https://substackcdn.com/image/fetch/$s_!3krr!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78dab463-1fc0-4790-abb6-0f825e7d7fa1_1250x1157.png 848w, https://substackcdn.com/image/fetch/$s_!3krr!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78dab463-1fc0-4790-abb6-0f825e7d7fa1_1250x1157.png 1272w, https://substackcdn.com/image/fetch/$s_!3krr!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78dab463-1fc0-4790-abb6-0f825e7d7fa1_1250x1157.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!3krr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78dab463-1fc0-4790-abb6-0f825e7d7fa1_1250x1157.png" width="1250" height="1157" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/78dab463-1fc0-4790-abb6-0f825e7d7fa1_1250x1157.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1157,&quot;width&quot;:1250,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1136248,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!3krr!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78dab463-1fc0-4790-abb6-0f825e7d7fa1_1250x1157.png 424w, https://substackcdn.com/image/fetch/$s_!3krr!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78dab463-1fc0-4790-abb6-0f825e7d7fa1_1250x1157.png 848w, https://substackcdn.com/image/fetch/$s_!3krr!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78dab463-1fc0-4790-abb6-0f825e7d7fa1_1250x1157.png 1272w, https://substackcdn.com/image/fetch/$s_!3krr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78dab463-1fc0-4790-abb6-0f825e7d7fa1_1250x1157.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3>Issues with the pre-trained model</h3><p>Using the ResNet-50 pre-trained embedding model presents a few key issues:</p><ol><li><p><strong>Similarity scores are too close together</strong>: The model fails to adequately distinguish between bird species in the dataset, with all similarity scores falling within the 0.8xx range. This lack of variation suggests the model isn't capturing enough nuanced differences between species.</p></li><li><p><strong>Sensitivity to image variations</strong>: The model is highly sensitive to changes in image orientation, rotation and lighting. For instance, showing it different variations of the same bird species&#8212;such as flipped or rotated images or different lighting conditions&#8212;leads to inconsistent and poor results, indicating a lack of robustness in handling such variations.</p></li></ol><h2>Option 2: Fine-tune a pre-trained model</h2><p>Another approach is to fine-tune the ResNet-50 model. 
This means starting with the pre-trained ResNet-50, leveraging the weights it learned during its initial training. Fine-tuning offers a significant advantage over training from scratch, which would require hundreds of hours and millions of images&#8212;resources we neither have nor need. Instead, fine-tuning allows us to quickly adapt the model to our specific use case, essentially taking a shortcut to build an effective embedding model.</p><p>Fine-tuning an embedding model follows a process similar to training a classifier. In our case, since we are working with a smaller dataset of 50 classes, we start the process as if we are training a 50-class classifier. After training, we simply remove the final classifier layer, converting the model into an <strong>embedding model</strong> ready to handle our similarity search use case.</p><h3>Follow along with the Jupyter Notebook</h3><p><a href="https://github.com/jaeyow/bird-species-embedding-model/blob/main/notebooks/02-fine-tuning-a-pre-trained-model.ipynb">Here is the Jupyter Notebook for fine-tuning a pre-trained ResNet-50 model.</a></p><p>In the images below, we used the same image set as in the pre-trained experiment, but this time we generated embeddings using the <strong>fine-tuned ResNet-50 model</strong>. The results show a drastic improvement, both in accuracy and in the returned similarity scores.</p><h3>Similarity Search with fine-tuned ResNet-50</h3><p>All unknown birds were matched correctly as the top result. Moreover, the range of similarity scores is much more promising&#8212;the top match has a similarity score close to 1, while the remaining results have scores closer to zero. 
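</p><p>As an aside, the ranking step itself is straightforward: a minimal sketch of a top-k cosine-similarity search over stored embeddings, using plain NumPy (the helper name and the toy 2-d "embeddings" are hypothetical; in the real application a vector database handles this at scale):</p>

```python
import numpy as np

def top_k_similar(query_emb, db_embs, k=10):
    # Normalise so dot products become cosine similarities.
    q = query_emb / np.linalg.norm(query_emb)
    db = db_embs / np.linalg.norm(db_embs, axis=1, keepdims=True)
    scores = db @ q
    # Indices of the k highest scores, best match first.
    top = np.argsort(scores)[::-1][:k]
    return top, scores[top]

# Toy 2-d "embeddings": entry 2 points almost the same way as the query.
db = np.array([[0.0, 1.0], [1.0, 1.0], [1.0, 0.05], [-1.0, 0.0]])
query = np.array([1.0, 0.0])
idx, scores = top_k_similar(query, db, k=2)
print(idx)  # [2 1]
```

<p>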
This demonstrates that the fine-tuned model captures bird species embeddings with far greater precision.</p><p>For example, the unknown birds&#8212;<a href="https://en.wikipedia.org/wiki/Philippine_eagle">Philippine Eagle</a>, <a href="https://en.wikipedia.org/wiki/Scarlet_tanager">Scarlet Tanager</a>, and <a href="https://en.wikipedia.org/wiki/Great_tit">Parus Major</a>&#8212;all returned the correct match as the top result! It's clear we've built a highly effective bird species embedding model, and impressively, the model is just under 100MB in size.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ZtD5!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55981189-6e55-4627-9beb-33ef1a6d701d_1234x1137.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ZtD5!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55981189-6e55-4627-9beb-33ef1a6d701d_1234x1137.png 424w, https://substackcdn.com/image/fetch/$s_!ZtD5!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55981189-6e55-4627-9beb-33ef1a6d701d_1234x1137.png 848w, https://substackcdn.com/image/fetch/$s_!ZtD5!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55981189-6e55-4627-9beb-33ef1a6d701d_1234x1137.png 1272w, https://substackcdn.com/image/fetch/$s_!ZtD5!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55981189-6e55-4627-9beb-33ef1a6d701d_1234x1137.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!ZtD5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55981189-6e55-4627-9beb-33ef1a6d701d_1234x1137.png" width="1234" height="1137" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/55981189-6e55-4627-9beb-33ef1a6d701d_1234x1137.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1137,&quot;width&quot;:1234,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1176403,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ZtD5!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55981189-6e55-4627-9beb-33ef1a6d701d_1234x1137.png 424w, https://substackcdn.com/image/fetch/$s_!ZtD5!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55981189-6e55-4627-9beb-33ef1a6d701d_1234x1137.png 848w, https://substackcdn.com/image/fetch/$s_!ZtD5!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55981189-6e55-4627-9beb-33ef1a6d701d_1234x1137.png 1272w, https://substackcdn.com/image/fetch/$s_!ZtD5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55981189-6e55-4627-9beb-33ef1a6d701d_1234x1137.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" 
class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!pL4n!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0385596e-ff82-4ce2-ba6b-f5076c2ad623_1225x1139.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!pL4n!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0385596e-ff82-4ce2-ba6b-f5076c2ad623_1225x1139.png 424w, 
https://substackcdn.com/image/fetch/$s_!pL4n!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0385596e-ff82-4ce2-ba6b-f5076c2ad623_1225x1139.png 848w, https://substackcdn.com/image/fetch/$s_!pL4n!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0385596e-ff82-4ce2-ba6b-f5076c2ad623_1225x1139.png 1272w, https://substackcdn.com/image/fetch/$s_!pL4n!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0385596e-ff82-4ce2-ba6b-f5076c2ad623_1225x1139.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!pL4n!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0385596e-ff82-4ce2-ba6b-f5076c2ad623_1225x1139.png" width="1225" height="1139" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0385596e-ff82-4ce2-ba6b-f5076c2ad623_1225x1139.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1139,&quot;width&quot;:1225,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1186674,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!pL4n!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0385596e-ff82-4ce2-ba6b-f5076c2ad623_1225x1139.png 424w, 
https://substackcdn.com/image/fetch/$s_!pL4n!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0385596e-ff82-4ce2-ba6b-f5076c2ad623_1225x1139.png 848w, https://substackcdn.com/image/fetch/$s_!pL4n!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0385596e-ff82-4ce2-ba6b-f5076c2ad623_1225x1139.png 1272w, https://substackcdn.com/image/fetch/$s_!pL4n!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0385596e-ff82-4ce2-ba6b-f5076c2ad623_1225x1139.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" 
y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!0xZo!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3df57700-1d2b-40a5-8a28-cee91f33e31c_1248x1153.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!0xZo!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3df57700-1d2b-40a5-8a28-cee91f33e31c_1248x1153.png 424w, https://substackcdn.com/image/fetch/$s_!0xZo!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3df57700-1d2b-40a5-8a28-cee91f33e31c_1248x1153.png 848w, https://substackcdn.com/image/fetch/$s_!0xZo!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3df57700-1d2b-40a5-8a28-cee91f33e31c_1248x1153.png 1272w, https://substackcdn.com/image/fetch/$s_!0xZo!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3df57700-1d2b-40a5-8a28-cee91f33e31c_1248x1153.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!0xZo!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3df57700-1d2b-40a5-8a28-cee91f33e31c_1248x1153.png" width="1248" height="1153" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3df57700-1d2b-40a5-8a28-cee91f33e31c_1248x1153.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1153,&quot;width&quot;:1248,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1159327,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!0xZo!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3df57700-1d2b-40a5-8a28-cee91f33e31c_1248x1153.png 424w, https://substackcdn.com/image/fetch/$s_!0xZo!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3df57700-1d2b-40a5-8a28-cee91f33e31c_1248x1153.png 848w, https://substackcdn.com/image/fetch/$s_!0xZo!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3df57700-1d2b-40a5-8a28-cee91f33e31c_1248x1153.png 1272w, https://substackcdn.com/image/fetch/$s_!0xZo!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3df57700-1d2b-40a5-8a28-cee91f33e31c_1248x1153.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>Conclusion</h2><p>While the <strong>Bird Watch</strong> project is hypothetical, the techniques we've explored in using embedding models&#8212;both pre-trained and fine-tuned&#8212;are very much applicable in the real world. The <strong>ResNet-50</strong> model we utilized was introduced by <a href="https://arxiv.org/pdf/1512.03385">Microsoft in 2015</a>, so it may not be considered cutting-edge technology. However, it's important to not underestimate the power of these machine learning models. 
With their ideal balance of accuracy and latency, these models continue to see widespread application in industry today.</p><p>Although we are applying this technology in a conceptual project like our bird species similarity search application, the same methods continue to be used in critical areas such as <a href="https://towardsdatascience.com/using-resnets-to-detect-anomalies-in-industrial-iot-textile-production-42da77d49ad2">automated visual inspection and quality control</a> in manufacturing, as well as in <a href="https://pubmed.ncbi.nlm.nih.gov/37320940/">medical imaging for diagnosis and cancer detection</a>. </p><p>Till then,</p><p>JO</p><p></p>]]></content:encoded></item><item><title><![CDATA[#4 - So how do you build a vector embedding model?]]></title><description><![CDATA[This project explores a popular way to build a vector embedding model using your own custom dataset]]></description><link>https://www.fullstackml.dev/p/4-so-how-do-you-build-an-image-embedding</link><guid isPermaLink="false">https://www.fullstackml.dev/p/4-so-how-do-you-build-an-image-embedding</guid><dc:creator><![CDATA[JO Reyes]]></dc:creator><pubDate>Sun, 29 Sep 2024 23:01:18 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!kiHG!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6342f4f8-736e-455c-a911-98ec65ee84c9_2038x1394.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>This article is <strong>part of a blog series about demystifying vector embedding models</strong> for use in image embeddings:</p><p><a href="https://fullstackml.dev/p/4-so-how-do-you-build-an-image-embedding">Part 1. So how do you build a vector embedding model?</a> - (this article) - Introduces vector embedding models and the intuition behind the technologies we can use to build one ourselves.<br><br><a href="https://fullstackml.dev/p/5-lets-build-our-embedding-model">Part 2. 
Let's build our image embedding model</a> - Shows a couple of ways to build embedding models - first by using a <strong>pre-trained model</strong>, and next by <strong>fine-tuning</strong> a pre-trained model. We use PyTorch to build our feature extractor.</p><p><a href="https://fullstackml.dev/p/6-using-metaflow-and-mlflow">Part 3. Modelling with Metaflow and MLFlow</a> - Here we use Metaflow to build our model training workflow, where we introduce the concept of checkpointing, and MLFlow for experiment tracking.</p><p><a href="https://fullstackml.dev/p/8-from-training-to-deployment-a-simple">Part 4. From Training to Deployment: A Simple Approach to Serving Embedding Models</a> - Packaging your ML model in a Docker container opens it up to a multitude of model serving options.</p><p><a href="https://fullstackml.dev/p/9-putting-our-bird-embedding-model">Part 5. Putting Our Bird Embedding Model to Work: Introducing the Web Frontend</a> - For our embedding model to prove useful to others, we have created a modern frontend to serve similarity inference to our users.</p><div><hr></div><p>Hi friends, </p><p>It&#8217;s been an eventful few months at work since my last post, where I wrote about spending a <a href="https://fullstackml.dev/p/3-weekend-fun-with-soms-fastapi-and">weekend playing with SOMs</a>. When I&#8217;m less busy work-wise, I tend to work on interesting end-to-end projects that have something to do with machine learning. I did tell myself that for 2024, Gen AI would be my focus, and it still is; however, I still make time for ML projects such as this.</p><p>One problem that stumped me recently was an image classification problem with 5,000+ classes. Yeah, I&#8217;ve done multiple classification problems before, but not with more than a dozen classes. I also have not done a lot of computer vision ML problems, so this is my chance to dig a little deeper and familiarise myself with the process. 
</p><h2>A Hypothetical Problem</h2><p>I am partnering with <strong>Bird Watch</strong>, a bird conservation and wildlife monitoring non-profit organization. They need a machine learning model that will help them identify bird species that have been captured by their drones and <a href="https://www.allaboutbirds.org/news/camera-traps-are-like-candid-camera-for-your-backyard-birds/">camera traps</a>. Instead of directly classifying each bird species, I was asked to return the 10 most similar bird species, ranked by similarity, with the most similar species naturally at the top of the list. This is good: instead of forcing the output into predefined categories, the system can find matches even when the species presented is not part of the dataset, which can be helpful in identifying new species. </p><ul><li><p>multi-class classification at this magnitude is not recommended, as the model would be large and expensive to train</p></li><li><p>after reflecting on the requirement to return the ten most similar bird species, I realised this was not really a classification problem after all</p></li><li><p>the input is an image, and the model needs to return the 10 most similar images from a database</p></li><li><p>so instead of an ML problem, we can actually treat this as a data retrieval problem, with the help of embedding vectors and a vector database</p></li><li><p>so if it&#8217;s similarity search we are doing here, what technology enables it? Semantic search and vector databases</p></li><li><p>and when talking about vector databases, the thing that underpins it all is the vector embeddings generated by an <strong>embedding model</strong></p></li></ul><h2>So, I don&#8217;t need a classification model?</h2><p>A typical bird classification model will return a species prediction, one out of the <strong>525 bird species</strong> in the dataset. 
But what do I do if I don&#8217;t want to predict the species? What machine learning model do I need when I want to return the 10 most similar species? We need a <strong>species similarity search</strong>, not a species classifier!</p><p>We need to build a <strong>vector embedding model</strong>! </p><p>Although vector embeddings have been around <a href="https://medium.com/@PolonioliAI/vectors-in-search-a-brief-history-8ba4a4586d23">since the 1980s and 1990s</a>, I only became aware of them recently, thanks to their rise in popularity with large language models like ChatGPT towards the end of 2022. </p><h2>How to build an embedding model</h2><p>To understand how to produce an embedding model, we first need to understand the main parts of a <strong>Convolutional Neural Network (CNN)</strong>. A CNN has three main parts:</p><ul><li><p><strong>Backbone</strong> - This is the part of the model that extracts the features from the image. This is typically the convolutional layers of the model.</p></li><li><p><strong>Neck</strong> - This is the part of the model that reduces the dimensionality of the features extracted by the backbone. This is typically the pooling layers of the model. 
This is typically the penultimate layer of the model, that outputs the features.</p></li><li><p><strong>Head</strong> - This is the part of the model that classifies the features extracted by the backbone and refined by the neck.</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!kiHG!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6342f4f8-736e-455c-a911-98ec65ee84c9_2038x1394.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!kiHG!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6342f4f8-736e-455c-a911-98ec65ee84c9_2038x1394.png 424w, https://substackcdn.com/image/fetch/$s_!kiHG!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6342f4f8-736e-455c-a911-98ec65ee84c9_2038x1394.png 848w, https://substackcdn.com/image/fetch/$s_!kiHG!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6342f4f8-736e-455c-a911-98ec65ee84c9_2038x1394.png 1272w, https://substackcdn.com/image/fetch/$s_!kiHG!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6342f4f8-736e-455c-a911-98ec65ee84c9_2038x1394.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!kiHG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6342f4f8-736e-455c-a911-98ec65ee84c9_2038x1394.png" width="1456" height="996" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6342f4f8-736e-455c-a911-98ec65ee84c9_2038x1394.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:996,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:468729,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!kiHG!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6342f4f8-736e-455c-a911-98ec65ee84c9_2038x1394.png 424w, https://substackcdn.com/image/fetch/$s_!kiHG!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6342f4f8-736e-455c-a911-98ec65ee84c9_2038x1394.png 848w, https://substackcdn.com/image/fetch/$s_!kiHG!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6342f4f8-736e-455c-a911-98ec65ee84c9_2038x1394.png 1272w, https://substackcdn.com/image/fetch/$s_!kiHG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6342f4f8-736e-455c-a911-98ec65ee84c9_2038x1394.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>This structure holds true for many of the image models hosted on Hugging Face. To produce an embedding model, we need to <strong>remove the head</strong> and use the <strong>neck's output</strong>, which, as I later realised, are actually THE vector embeddings! These embeddings are the features extracted by the backbone and refined by the neck.</p><p>It is as easy as that.
<strong>Just remove the classification head</strong>, regardless of whether you are using a pre-trained model or a fine-tuned model, and you have just produced an embedding model.</p><p><strong>That is exactly what we will do: perform a &#8216;surgery&#8217; on the fine-tuned model that we will build for this article!</strong></p><h2>Pre-trained image model</h2><p>There is a multitude of pre-trained image models available online - <a href="https://www.tensorflow.org/hub">Tensorflow</a> and <a href="https://pytorch.org/hub/">PyTorch</a> have their model hubs, and cloud providers like AWS, Google and Azure also have their own AI galleries. For this article, however, we will be using <a href="https://huggingface.co/docs/timm/index">timm (Py</a><strong><a href="https://huggingface.co/docs/timm/index">T</a></strong><a href="https://huggingface.co/docs/timm/index">orch </a><strong><a href="https://huggingface.co/docs/timm/index">Im</a></strong><a href="https://huggingface.co/docs/timm/index">age </a><strong><a href="https://huggingface.co/docs/timm/index">M</a></strong><a href="https://huggingface.co/docs/timm/index">odels)</a>, a PyTorch library that contains a collection of models (over 1,400), plus miscellaneous code that can help you build, train or fine-tune your image models. When you use this library, it actually downloads the model weights from <a href="https://huggingface.co/">Hugging Face</a>, another well-known model hub that initially became popular through its hosting of NLP models. These days, you will find there almost any type of open-source machine learning model to fine-tune or to use outright.</p><h2>Let&#8217;s check out ResNet50</h2><p>ResNet 50, short for <strong>Res</strong>idual <strong>Net</strong>work, is a deep learning model that is popular for image classification tasks. It was introduced by Microsoft Research in 2015, and it became the basis of many deep learning architectures after that.
It was <a href="https://arxiv.org/pdf/1512.03385">first introduced in this paper by He et al.</a></p><h2>Feature extractor</h2><p>ResNet 50 is part of the ResNet family of image models, and has proven its performance as a pre-trained model in a process called <a href="https://en.wikipedia.org/wiki/Transfer_learning">transfer learning</a> - where it learns weights from large datasets, like ImageNet, and uses them to improve accuracy on much smaller datasets, like our bird species dataset.</p><p>For our embedding model, we will start by training a ResNet 50 classifier. Once we have built our fine-tuned model, we will discard the classification head, leaving just the embedding model. In other words, we&#8217;re left with the feature extractor, and that&#8217;s precisely what we need. </p><h2>And how about our similarity search?</h2><p>Once a vector embedding model has been built, we&#8217;re halfway to our bird species similarity search engine. We can then build it in three more steps:</p><ul><li><p>ingest our bird species dataset into a vector database, using the fine-tuned embedding model to generate the embedding of each image</p></li><li><p>using the same embedding model, get the vector embedding of the new bird species image</p></li><li><p>finally, perform a similarity search of this new bird species image against all the bird species ingested in the vector database. There are several similarity criteria to choose from, but the most common are <a href="https://en.wikipedia.org/wiki/Cosine_similarity">cosine similarity</a> and <a href="https://en.wikipedia.org/wiki/Euclidean_distance">Euclidean distance</a>. These similarity metrics are standard in the popular vector databases these days.</p></li></ul><h2>Conclusion</h2><p>In this article, we have learned that image models like <strong>ResNet 50</strong> are used to train and fine-tune classification models.
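</p><p>As a concrete aside: the heart of the similarity-search step above is just cosine similarity between the query embedding and every stored embedding. A minimal NumPy sketch, with random vectors standing in for real bird embeddings and a brute-force scan standing in for a vector database:</p>

```python
import numpy as np

rng = np.random.default_rng(42)
db = rng.normal(size=(500, 2048))   # 500 stored bird embeddings (fake)
query = rng.normal(size=2048)       # embedding of the new bird image

# Cosine similarity is the dot product of L2-normalised vectors
db_norm = db / np.linalg.norm(db, axis=1, keepdims=True)
q_norm = query / np.linalg.norm(query)
scores = db_norm @ q_norm           # one similarity score per stored bird

# Indices of the 10 most similar species, best match first
top10 = np.argsort(scores)[::-1][:10]
print(top10)
```

<p>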
But to produce an embedding model, <strong>we can simply remove the model&#8217;s classification head</strong>, leaving behind the feature extractor and a model that produces embedding vectors - our fine-tuned embedding model.</p><p>In the next article, we&#8217;ll actually build this embedding model, in a Jupyter notebook that we&#8217;ll create for this series. </p><p>Till then,</p><p>JO</p>]]></content:encoded></item><item><title><![CDATA[#3 - Weekend fun with SOMs, FastAPI and AWS Lambda Container Images]]></title><description><![CDATA[Today we take a quick pit stop from our F1 Prediction project, and explore a fun little algorithm with our very own python implementation of Kohonen Self-Organising Maps]]></description><link>https://www.fullstackml.dev/p/3-weekend-fun-with-soms-fastapi-and</link><guid isPermaLink="false">https://www.fullstackml.dev/p/3-weekend-fun-with-soms-fastapi-and</guid><dc:creator><![CDATA[JO Reyes]]></dc:creator><pubDate>Sat, 30 Mar 2024 07:31:09 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!ibpJ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5eb89ab0-fc8d-41b9-bfdf-5e1e66a1b26b_730x732.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hi friends,</p><p>A couple of weeks ago, I stumbled upon an unsupervised learning algorithm called Kohonen Self-Organising Maps. Koho-wwhattt?!!</p><p><a href="https://github.com/jaeyow/kohonen-som/tree/master">Github project for this article</a></p><p>Self-organising maps are nothing novel; in fact, they were first introduced by the Finnish professor <a href="https://en.wikipedia.org/wiki/Teuvo_Kohonen">Teuvo Kohonen</a> in the 1980s.
It is a type of artificial neural network, but instead of the usual <a href="https://en.wikipedia.org/wiki/Backpropagation">backpropagation</a> (gradient descent) used in neural networks, Kohonen SOMs use <a href="https://en.wikipedia.org/wiki/Competitive_learning">competitive learning</a>. They were quite popular around 2005; however, I seldom (never) hear of them being used these days. There are SOM libraries in popular languages like <a href="https://cran.r-project.org/web/packages/som/index.html">Python</a>, <a href="https://cran.r-project.org/web/packages/som/index.html">R</a> and <a href="https://au.mathworks.com/help/deeplearning/ref/selforgmap.html">Matlab</a>, to name a few.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ibpJ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5eb89ab0-fc8d-41b9-bfdf-5e1e66a1b26b_730x732.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ibpJ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5eb89ab0-fc8d-41b9-bfdf-5e1e66a1b26b_730x732.png 424w, https://substackcdn.com/image/fetch/$s_!ibpJ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5eb89ab0-fc8d-41b9-bfdf-5e1e66a1b26b_730x732.png 848w, https://substackcdn.com/image/fetch/$s_!ibpJ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5eb89ab0-fc8d-41b9-bfdf-5e1e66a1b26b_730x732.png 1272w, 
https://substackcdn.com/image/fetch/$s_!ibpJ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5eb89ab0-fc8d-41b9-bfdf-5e1e66a1b26b_730x732.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ibpJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5eb89ab0-fc8d-41b9-bfdf-5e1e66a1b26b_730x732.png" width="544" height="545.4904109589041" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5eb89ab0-fc8d-41b9-bfdf-5e1e66a1b26b_730x732.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:732,&quot;width&quot;:730,&quot;resizeWidth&quot;:544,&quot;bytes&quot;:169160,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ibpJ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5eb89ab0-fc8d-41b9-bfdf-5e1e66a1b26b_730x732.png 424w, https://substackcdn.com/image/fetch/$s_!ibpJ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5eb89ab0-fc8d-41b9-bfdf-5e1e66a1b26b_730x732.png 848w, https://substackcdn.com/image/fetch/$s_!ibpJ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5eb89ab0-fc8d-41b9-bfdf-5e1e66a1b26b_730x732.png 1272w, 
https://substackcdn.com/image/fetch/$s_!ibpJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5eb89ab0-fc8d-41b9-bfdf-5e1e66a1b26b_730x732.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>What attracted me to SOMs was its ability to create pretty abstract blobs of colour that are almost like art, if you ask me. (This colourful representation is possible if you use a 3-component vector, like the RGB components we use in this article.)
Above is one such example of an output generated by our Python implementation.</p><h2>What are SOMs?</h2><p><a href="https://en.wikipedia.org/wiki/Self-organizing_map">Kohonen Self-Organising Maps (SOM)</a> are a type of unsupervised learning algorithm. Kohonen Maps are typically used for clustering and visualisation, so that higher-dimensional data can be represented in lower dimensions, typically in 2D, such as a rectangular topology or grid.</p><p>In addition to segmentation and clustering analysis, SOMs are also a form of dimensionality reduction, in that the high-dimensional data in the input layer can be represented in the output grid.</p><h2>SOM Training</h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!z5j-!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0520d38-e929-4c38-966c-e8b431270b21_420x302.gif" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!z5j-!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0520d38-e929-4c38-966c-e8b431270b21_420x302.gif 424w, https://substackcdn.com/image/fetch/$s_!z5j-!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0520d38-e929-4c38-966c-e8b431270b21_420x302.gif 848w, https://substackcdn.com/image/fetch/$s_!z5j-!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0520d38-e929-4c38-966c-e8b431270b21_420x302.gif 1272w, 
https://substackcdn.com/image/fetch/$s_!z5j-!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0520d38-e929-4c38-966c-e8b431270b21_420x302.gif 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!z5j-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0520d38-e929-4c38-966c-e8b431270b21_420x302.gif" width="420" height="302" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d0520d38-e929-4c38-966c-e8b431270b21_420x302.gif&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:302,&quot;width&quot;:420,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:22157,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/gif&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!z5j-!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0520d38-e929-4c38-966c-e8b431270b21_420x302.gif 424w, https://substackcdn.com/image/fetch/$s_!z5j-!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0520d38-e929-4c38-966c-e8b431270b21_420x302.gif 848w, https://substackcdn.com/image/fetch/$s_!z5j-!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0520d38-e929-4c38-966c-e8b431270b21_420x302.gif 1272w, 
https://substackcdn.com/image/fetch/$s_!z5j-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0520d38-e929-4c38-966c-e8b431270b21_420x302.gif 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p> I have created a <a href="https://github.com/jaeyow/kohonen-som/blob/master/kohonen.ipynb">notebook</a>, a <a href="https://github.com/jaeyow/kohonen-som/blob/master/app/algorithm/kohonen.py">Python implementation</a> and a <a href="https://github.com/jaeyow/kohonen-som/blob/master/app/app.py">FastAPI client</a> that I have made available in GitHub to use any 
which way you want. This <a href="https://github.com/jaeyow/kohonen-som/blob/master/kohonen.ipynb">Jupyter notebook</a> goes through the <a href="https://en.wikipedia.org/wiki/Self-organizing_map">training of self-organising maps</a> as shown below:</p><ol><li><p>Each node's weights are initialised.</p></li><li><p>We enumerate through the training data for N iterations (repeating the data if necessary). The current value we are training against will be referred to as the <code>current input vector</code>.</p></li><li><p>Every node is examined to calculate which one's weights are most like the input vector. The winning node is commonly known as the Best Matching Unit (BMU).</p></li><li><p>The radius of the neighbourhood of the BMU is now calculated. This is a value that starts large, typically set to the 'radius' of the lattice, but diminishes each time-step. Any nodes found within this radius are deemed to be inside the BMU's neighbourhood.</p></li><li><p>The weights of each neighbouring node (the nodes found in step 4) are adjusted to make them more like the input vector. The closer a node is to the BMU, the more its weights get altered.</p></li><li><p>Go to step 2 until we've completed N iterations.</p></li></ol><h2>Kohonen Input and Output Layer</h2><p>A Kohonen SOM has two layers: the <strong>input layer</strong> and the <strong>output layer</strong>. The input layer is made up of the features of the data, while the output layer is made up of the nodes that will be trained to represent the input data. In this project, our input layer is made up of floats (which represent RGB colours), and the output layer is a 2D grid of nodes, each node also made up of RGB components.</p><p>The image below shows a very simple SOM with an input layer of 3 features and an output layer of 16 nodes. In this project, we only have 3 features (RGB colour components); however, you can have more features in the input layer.
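</p><p>The six training steps above fit in a few lines of NumPy. This is a compact sketch rather than the repo&#8217;s exact code - in particular, the exponential decay schedules for the radius and learning rate are assumptions on my part:</p>

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.random((20, 3))                  # 20 random RGB input vectors
grid_h = grid_w = 10
weights = rng.random((grid_h, grid_w, 3))   # step 1: initialise node weights

n_iters = 100
sigma0 = max(grid_h, grid_w) / 2            # initial neighbourhood radius
lr0 = 0.1                                   # initial learning rate
tau = n_iters / np.log(sigma0)              # radius decay time constant

# (row, col) position of every node on the grid, shape (h, w, 2)
positions = np.indices((grid_h, grid_w)).transpose(1, 2, 0)

for t in range(n_iters):
    vec = data[t % len(data)]               # step 2: current input vector

    # step 3: the BMU is the node whose weights are closest to vec
    dists = np.linalg.norm(weights - vec, axis=2)
    bmu = np.unravel_index(np.argmin(dists), dists.shape)

    # step 4: the neighbourhood radius shrinks each time-step
    sigma = sigma0 * np.exp(-t / tau)
    lr = lr0 * np.exp(-t / n_iters)

    # step 5: pull nodes towards vec, more strongly the closer
    # they sit to the BMU on the grid
    grid_dist2 = ((positions - np.array(bmu)) ** 2).sum(axis=2)
    influence = np.exp(-grid_dist2 / (2 * sigma**2))
    weights += lr * influence[:, :, None] * (vec - weights)
```

<p>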
For example, if you are trying to segment customers, the features can be income, sex, age, race, etc.</p><p>With our input features (and output node weights) conveniently being RGB components, we can easily visualise the output layer as a 2D grid of colours.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!DYL8!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F65872660-9ae2-4c44-b541-edcba3d421db_977x738.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!DYL8!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F65872660-9ae2-4c44-b541-edcba3d421db_977x738.png 424w, https://substackcdn.com/image/fetch/$s_!DYL8!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F65872660-9ae2-4c44-b541-edcba3d421db_977x738.png 848w, https://substackcdn.com/image/fetch/$s_!DYL8!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F65872660-9ae2-4c44-b541-edcba3d421db_977x738.png 1272w, https://substackcdn.com/image/fetch/$s_!DYL8!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F65872660-9ae2-4c44-b541-edcba3d421db_977x738.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!DYL8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F65872660-9ae2-4c44-b541-edcba3d421db_977x738.png" width="977" height="738" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/65872660-9ae2-4c44-b541-edcba3d421db_977x738.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:738,&quot;width&quot;:977,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:167554,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!DYL8!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F65872660-9ae2-4c44-b541-edcba3d421db_977x738.png 424w, https://substackcdn.com/image/fetch/$s_!DYL8!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F65872660-9ae2-4c44-b541-edcba3d421db_977x738.png 848w, https://substackcdn.com/image/fetch/$s_!DYL8!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F65872660-9ae2-4c44-b541-edcba3d421db_977x738.png 1272w, https://substackcdn.com/image/fetch/$s_!DYL8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F65872660-9ae2-4c44-b541-edcba3d421db_977x738.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>If you look at each node in the output layer, you will notice that there are 3 lines into each node. These lines represent the weight of each feature in the input layer. Each output node's weight is also a 3-dimensional vector, the same shape as the input layer, and also represents an RGB colour. When updating the node weights during training, the changes to the weights are easily visualised as colours.</p><h2>SOM Output</h2><p>Given the input layer made up of 20 random RGB colours, you will later notice that the output image contains colour groups from this layer. 
</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Ncqr!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2391cf3a-9601-4e65-afb8-8fb231ddb3ac_1094x156.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Ncqr!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2391cf3a-9601-4e65-afb8-8fb231ddb3ac_1094x156.png 424w, https://substackcdn.com/image/fetch/$s_!Ncqr!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2391cf3a-9601-4e65-afb8-8fb231ddb3ac_1094x156.png 848w, https://substackcdn.com/image/fetch/$s_!Ncqr!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2391cf3a-9601-4e65-afb8-8fb231ddb3ac_1094x156.png 1272w, https://substackcdn.com/image/fetch/$s_!Ncqr!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2391cf3a-9601-4e65-afb8-8fb231ddb3ac_1094x156.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Ncqr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2391cf3a-9601-4e65-afb8-8fb231ddb3ac_1094x156.png" width="512" height="73.00914076782449" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2391cf3a-9601-4e65-afb8-8fb231ddb3ac_1094x156.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:156,&quot;width&quot;:1094,&quot;resizeWidth&quot;:512,&quot;bytes&quot;:23543,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Ncqr!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2391cf3a-9601-4e65-afb8-8fb231ddb3ac_1094x156.png 424w, https://substackcdn.com/image/fetch/$s_!Ncqr!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2391cf3a-9601-4e65-afb8-8fb231ddb3ac_1094x156.png 848w, https://substackcdn.com/image/fetch/$s_!Ncqr!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2391cf3a-9601-4e65-afb8-8fb231ddb3ac_1094x156.png 1272w, https://substackcdn.com/image/fetch/$s_!Ncqr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2391cf3a-9601-4e65-afb8-8fb231ddb3ac_1094x156.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>In the below left image, a 100x100 output layer is initialised with random colours. Running the SOM algorithm for 1000 iterations will produce the output on the right, where we can see roughly 20 colours grouped together. 
</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!8uAe!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F329eb1e8-32b8-4dda-812a-0366f1382dde_2000x1000.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!8uAe!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F329eb1e8-32b8-4dda-812a-0366f1382dde_2000x1000.png 424w, https://substackcdn.com/image/fetch/$s_!8uAe!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F329eb1e8-32b8-4dda-812a-0366f1382dde_2000x1000.png 848w, https://substackcdn.com/image/fetch/$s_!8uAe!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F329eb1e8-32b8-4dda-812a-0366f1382dde_2000x1000.png 1272w, https://substackcdn.com/image/fetch/$s_!8uAe!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F329eb1e8-32b8-4dda-812a-0366f1382dde_2000x1000.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!8uAe!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F329eb1e8-32b8-4dda-812a-0366f1382dde_2000x1000.png" width="1456" height="728" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/329eb1e8-32b8-4dda-812a-0366f1382dde_2000x1000.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:728,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:751761,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!8uAe!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F329eb1e8-32b8-4dda-812a-0366f1382dde_2000x1000.png 424w, https://substackcdn.com/image/fetch/$s_!8uAe!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F329eb1e8-32b8-4dda-812a-0366f1382dde_2000x1000.png 848w, https://substackcdn.com/image/fetch/$s_!8uAe!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F329eb1e8-32b8-4dda-812a-0366f1382dde_2000x1000.png 1272w, https://substackcdn.com/image/fetch/$s_!8uAe!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F329eb1e8-32b8-4dda-812a-0366f1382dde_2000x1000.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>Vectorisation</h2><p>In my first attempt at implementing the Kohonen SOM algorithm, I used typical Python nested loops, following the <a href="https://github.com/jaeyow/kohonen-som/blob/master/kohonen.ipynb">algorithm to the letter</a>. However, I quickly realised that increasing the iterations to 200, 500, 1000 or more would slow it to a crawl, which is hardly what you want in production.</p><p>The algorithm can be vectorised with <a href="https://numpy.org/doc/stable/index.html">NumPy</a>, which makes it both more efficient and faster. I implemented both versions and compared their execution times.</p><p>The vectorised implementation is around <strong>76x</strong> faster (from around 10 minutes down to 9 seconds), so vectorisation is well worth remembering whenever your Python implementations involve numerous loops. 
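</p><p>To make the speed-up concrete, here is a minimal illustrative sketch (not the exact code from the linked repository) of the innermost step of the algorithm, the best-matching unit (BMU) search, written first with nested loops and then as a single vectorised NumPy expression:</p>

```python
import numpy as np

def bmu_loops(weights, sample):
    """Find the best-matching unit with explicit nested loops."""
    rows, cols, _ = weights.shape
    best, best_dist = (0, 0), float("inf")
    for i in range(rows):
        for j in range(cols):
            d = np.sum((weights[i, j] - sample) ** 2)
            if d < best_dist:
                best, best_dist = (i, j), d
    return best

def bmu_vectorised(weights, sample):
    """The same search as one vectorised NumPy expression."""
    dists = np.sum((weights - sample) ** 2, axis=-1)
    return np.unravel_index(np.argmin(dists), dists.shape)

# 100x100 grid of RGB weight vectors, as in the colour example above
weights = np.random.rand(100, 100, 3)
sample = np.random.rand(3)
assert bmu_loops(weights, sample) == bmu_vectorised(weights, sample)
```

<p>The looped version pays Python interpreter overhead for every one of the 10,000 cells on every iteration, while the vectorised version does the whole grid in a few native NumPy calls, which is where most of the saving comes from.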
</p><h2>Production deployment</h2><p>In the spirit of full-stack machine learning, we should not <a href="https://github.com/jaeyow/kohonen-som/blob/master/README.md#productionising-this-api">disregard production deployment</a> of the resulting API. This project uses <a href="https://fastapi.tiangolo.com/">FastAPI</a> to implement the web API that serves our ML model.</p><p>For this project, my deployment preference is serverless first, so I would use AWS Lambda Container Images to deploy it. As with many APIs for data and machine learning applications, the required dependencies will easily exceed the 250MB uncompressed limit, even when spread across multiple <a href="https://docs.aws.amazon.com/lambda/latest/dg/gettingstarted-concepts.html#gettingstarted-concepts-layer">Lambda Layers</a>. All the popular Python libraries are quite chunky and quickly blow past this hard limit: NumPy alone is already over 100MB, and we still have FastAPI, Uvicorn, and Mangum to add.</p><p><a href="https://docs.aws.amazon.com/lambda/latest/dg/images-create.html">Lambda Container Images</a>, on the other hand, allow a container image of up to 10GB for our API. I have supplied a Docker image definition for <a href="https://github.com/jaeyow/kohonen-som/blob/master/Dockerfile.aws.lambda">Lambda containers</a>, as well as a <a href="https://github.com/jaeyow/kohonen-som/blob/master/Dockerfile">standard image</a> for any cloud provider that can deploy containers. </p><p>However, for applications with serious production requirements, a more comprehensive ML serving solution such as <a href="https://aws.amazon.com/sagemaker/">AWS SageMaker</a>, <a href="https://www.bentoml.com/">BentoML</a>, or <a href="https://www.tensorflow.org/tfx/guide/serving?hl=en">TensorFlow Serving</a> may be more appropriate. 
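</p><p>For reference, the sketch below shows the typical shape of a Lambda container image Dockerfile for a FastAPI app wrapped with Mangum. The module and handler names here are placeholders, not necessarily those used in the repository&#8217;s actual Dockerfiles:</p>

```dockerfile
# Start from the AWS-provided Python base image for Lambda
FROM public.ecr.aws/lambda/python:3.12

# Install the (chunky) dependencies into the Lambda task root
COPY requirements.txt .
RUN pip install -r requirements.txt --target "${LAMBDA_TASK_ROOT}"

# Copy the application code; main.py is assumed to define
# `app = FastAPI()` and `handler = Mangum(app)` (illustrative names)
COPY main.py ${LAMBDA_TASK_ROOT}

# Tell the Lambda runtime which handler to invoke
CMD ["main.handler"]
```

<p>The AWS base image bundles the Lambda runtime interface client, so the same image can be run and tested locally before being pushed to ECR for deployment.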
</p><p>Till then,</p><p>JO</p><h2>Handy Resources</h2><p><a href="https://github.com/jaeyow/kohonen-som/tree/master">Github project for this article</a></p><p><a href="https://danieldiamond.github.io/kohonen-som/">Kohonen Self Organising Map (SOM)</a></p><p><a href="https://sites.pitt.edu/~is2470pb/Spring05/FinalProjects/Group1a/tutorial/som.html">Self-Organising Maps</a> </p><p><a href="https://towardsdatascience.com/kohonen-self-organizing-maps-a29040d688da">Kohonen Self-Organising Maps - A special type of Artificial Neural Network</a> </p><p><a href="https://en.wikipedia.org/wiki/Self-organizing_map">Wikipedia - Self-organising map</a> </p>]]></content:encoded></item><item><title><![CDATA[#2 - Let's build a production-ready ML product]]></title><description><![CDATA[Join me in building it with a budget of - somewhere between free and cheap! &#129315;&#127950;&#65039;]]></description><link>https://www.fullstackml.dev/p/2-lets-build-an-ml-product</link><guid isPermaLink="false">https://www.fullstackml.dev/p/2-lets-build-an-ml-product</guid><dc:creator><![CDATA[JO Reyes]]></dc:creator><pubDate>Tue, 24 Oct 2023 22:01:12 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/6b40c3f5-7597-4ffe-843c-f7e2516d4a44_918x539.gif" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hi friends,</p><p>Choosing the right project to work on can be as tricky as finding your way through a maze of endless options.</p><p>New AI announcements (especially Gen AI) seem to drop like clockwork these days, and the unlimited project choices can be <strong>overwhelming and paralyzing</strong>. If you know me, you'll know I'm a big fan of Formula 1. So, it wasn't hard when it came to picking my next machine learning project. 
I needed something that would be exciting to myself, a project I could pour endless hours into without blinking an eye.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!OXbF!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0d39bff3-f2d8-48b5-9ab8-373265a5ac34.avif" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!OXbF!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0d39bff3-f2d8-48b5-9ab8-373265a5ac34.avif 424w, https://substackcdn.com/image/fetch/$s_!OXbF!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0d39bff3-f2d8-48b5-9ab8-373265a5ac34.avif 848w, https://substackcdn.com/image/fetch/$s_!OXbF!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0d39bff3-f2d8-48b5-9ab8-373265a5ac34.avif 1272w, https://substackcdn.com/image/fetch/$s_!OXbF!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0d39bff3-f2d8-48b5-9ab8-373265a5ac34.avif 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!OXbF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0d39bff3-f2d8-48b5-9ab8-373265a5ac34.avif" width="997" height="561" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0d39bff3-f2d8-48b5-9ab8-373265a5ac34.avif&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:561,&quot;width&quot;:997,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:88829,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/avif&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!OXbF!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0d39bff3-f2d8-48b5-9ab8-373265a5ac34.avif 424w, https://substackcdn.com/image/fetch/$s_!OXbF!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0d39bff3-f2d8-48b5-9ab8-373265a5ac34.avif 848w, https://substackcdn.com/image/fetch/$s_!OXbF!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0d39bff3-f2d8-48b5-9ab8-373265a5ac34.avif 1272w, https://substackcdn.com/image/fetch/$s_!OXbF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0d39bff3-f2d8-48b5-9ab8-373265a5ac34.avif 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 
7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Formula 1 Racing (&#169; Formula 1 image)</figcaption></figure></div><p>Friends, we're about to dive into the exciting world of <strong>Formula 1 race prediction</strong>! </p><p>This isn't my first rodeo with race prediction: I dipped my toes into it during a Data Science boot camp I completed a while back. That experience taught me the ropes of ML model development, however, this time, it will be a full-on ML-based product. 
 Buckle up as we delve into applied data science!<br><br>To start off, here are our requirements and constraints:</p><ul><li><p>serves race predictions using a production-ready (<strong>infinitely scalable</strong>) architecture</p></li><li><p><strong>web application</strong> deployed on a modern stack</p></li><li><p>historical data comes from an <strong>API</strong>, and miscellaneous race information will be <strong>scraped</strong> from online sources </p></li><li><p>following the <strong>FTI Pipeline Architecture</strong> as <a href="https://www.hopsworks.ai/post/mlops-to-ml-systems-with-fti-pipelines">described here</a>, we will have a <strong>feature pipeline</strong> that runs on a specified schedule to update racing data as it becomes available.</p></li><li><p><strong>data validation</strong> and <strong>integrity checks</strong> are part of our feature pipeline.</p></li><li><p>a <strong>training pipeline</strong> that runs when new features become available. As this series is not intended as a tutorial on model development, we will simply pick a good model (deep learning or classical ML; I have not made up my mind yet) and discuss model selection criteria as we progress. </p></li><li><p>use a <strong>modern platform</strong> to track our training runs, model tuning, and the models created by this pipeline</p></li><li><p>a <strong>batch inference pipeline</strong> that runs when a new model is ready and populates a database (or simply a file in storage) with pre-calculated predictions</p></li><li><p>a race <strong>prediction API</strong> that returns these pre-calculated prediction results to our user interface</p></li><li><p>our system will also have model prediction performance <strong>monitoring</strong> built in</p></li><li><p>use exclusively <strong>free* and serverless infrastructure</strong>. <br>*note: A generous free tier is more than enough for our use case. 
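</p><p>To make the batch-serving idea in the pipeline bullets above concrete, here is a minimal sketch, where the file name, schema, and values are illustrative placeholders rather than the project&#8217;s real design: the batch inference pipeline populates storage with pre-calculated predictions, and the prediction API serves them with a cheap lookup:</p>

```python
import json
from pathlib import Path

# Illustrative storage location - the real pipeline might write to S3 instead.
PREDICTIONS_FILE = Path("predictions.json")

def write_predictions(predictions):
    """Batch inference pipeline step: persist pre-calculated predictions."""
    PREDICTIONS_FILE.write_text(json.dumps(predictions))

def get_prediction(race_id):
    """Prediction API step: a cheap lookup, with no model in the request path."""
    cache = json.loads(PREDICTIONS_FILE.read_text())
    return cache.get(race_id)

# Dummy data for illustration only - not a real prediction.
write_predictions({"2023-abu-dhabi": {"winner": "VER", "confidence": 0.81}})
print(get_prediction("2023-abu-dhabi"))
```

<p>Because the model never runs in the request path, the prediction API stays fast and cheap, which is what makes a simple AWS Lambda function a good fit for serving.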
</p><div><hr></div></li></ul><h2>Tech Stack</h2><p>The tech stack dump below may look like a lot, and it is; however, it is typical of what you would find in industry and among <strong>startups building MVPs</strong>. When you are building a product, you&#8217;ll find yourself cobbling different applications together, as there&#8217;s no ONE application that can do it all. </p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!7cRp!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85e9f989-05ef-4aec-95af-e087ca23e734_822x482.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!7cRp!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85e9f989-05ef-4aec-95af-e087ca23e734_822x482.png 424w, https://substackcdn.com/image/fetch/$s_!7cRp!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85e9f989-05ef-4aec-95af-e087ca23e734_822x482.png 848w, https://substackcdn.com/image/fetch/$s_!7cRp!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85e9f989-05ef-4aec-95af-e087ca23e734_822x482.png 1272w, https://substackcdn.com/image/fetch/$s_!7cRp!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85e9f989-05ef-4aec-95af-e087ca23e734_822x482.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!7cRp!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85e9f989-05ef-4aec-95af-e087ca23e734_822x482.png" width="822" height="482" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/85e9f989-05ef-4aec-95af-e087ca23e734_822x482.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:482,&quot;width&quot;:822,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:279785,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!7cRp!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85e9f989-05ef-4aec-95af-e087ca23e734_822x482.png 424w, https://substackcdn.com/image/fetch/$s_!7cRp!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85e9f989-05ef-4aec-95af-e087ca23e734_822x482.png 848w, https://substackcdn.com/image/fetch/$s_!7cRp!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85e9f989-05ef-4aec-95af-e087ca23e734_822x482.png 1272w, https://substackcdn.com/image/fetch/$s_!7cRp!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85e9f989-05ef-4aec-95af-e087ca23e734_822x482.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Don&#8217;t worry, we will go through each one as we use them in the project. Here are some descriptions of each tool:</p><ul><li><p><a href="https://www.python.org/">Python</a> - the bulk of this project will be written in Python, as it is easy to learn and maintain, and its ecosystem is so rich that it is the de facto standard in machine learning.</p></li><li><p><a href="https://metaflow.org/">Metaflow</a> - all our Python workflows will be orchestrated by Metaflow, an open-source ML tool built and battle-tested at <a href="https://www.netflix.com/">Netflix</a>. I&#8217;ve been using it for a while now, and I love how easily I can transition from my laptop to the cloud with minimal changes.</p></li><li><p><a href="https://vercel.com/solutions/nextjs">NextJS on Vercel</a> - ReactJS is great, but NextJS is even better. It is a framework built on top of React, designed for production, and it is super fast. 
 I wrote a <a href="https://fullstackdeveloper.tips/pre-render-strategies-with-nextjs/#">blog article about it a while back</a>, and it was already awesome then.</p></li><li><p><a href="https://www.crummy.com/software/BeautifulSoup/">Beautiful Soup</a> - most of the dataset for the project comes from <a href="https://ergast.com/mrd/">Ergast.com</a>; however, we are on borrowed time, as it has been deprecated and will only be available until the end of 2024. For the latest race results, we will scrape the Formula 1 website directly. </p></li><li><p><a href="https://www.hopsworks.ai/">Hopsworks</a> - I have mentioned that we will be using the <a href="https://www.hopsworks.ai/post/mlops-to-ml-systems-with-fti-pipelines">FTI pipeline architecture</a>, the mental model from the Hopsworks team. This architecture revolves around feature stores, which we will use here.</p></li><li><p><a href="https://duckdb.org/">DuckDB</a> - we will use DuckDB instead of Pandas so we can work with our data more easily in SQL. I will take any chance to avoid Pandas if I can help it. </p></li><li><p><a href="https://aws.amazon.com/lambda/features/">AWS Lambda</a> - since our model serving is quite simple (we only serve online predictions from cached, pre-calculated results), we don&#8217;t need the complexity of something like <a href="https://docs.seldon.io/projects/seldon-core/en/latest/reference/concepts.html">Seldon</a> running on Kubernetes clusters. </p></li><li><p><a href="https://www.comet.com/">Comet</a> - for experiment tracking and a model registry, we can use trusty Comet, with its easy-to-use UI and Python API. </p></li><li><p><a href="https://greatexpectations.io/">Great Expectations</a> - to enforce data validation and ensure data quality as data is ingested into our pipeline, we use GX. 
</p></li><li><p><a href="https://www.evidentlyai.com/">Evidently AI </a>- monitoring the models in production is made possible by Evidently AI, and it can be used to detect drift and help us decide whether to retrain our models or not. </p><p></p><blockquote><p><em>I&#8217;d like to think that our budget for this project - is anywhere between free and cheap!  </em>&#129315;<em><br><br>- JO </em></p></blockquote><div><hr></div></li></ul><h2>Architecture Diagram</h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!6gi_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60bcbb18-8f72-4657-9660-d0395c8cd513_918x539.gif" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!6gi_!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60bcbb18-8f72-4657-9660-d0395c8cd513_918x539.gif 424w, https://substackcdn.com/image/fetch/$s_!6gi_!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60bcbb18-8f72-4657-9660-d0395c8cd513_918x539.gif 848w, https://substackcdn.com/image/fetch/$s_!6gi_!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60bcbb18-8f72-4657-9660-d0395c8cd513_918x539.gif 1272w, https://substackcdn.com/image/fetch/$s_!6gi_!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60bcbb18-8f72-4657-9660-d0395c8cd513_918x539.gif 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!6gi_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60bcbb18-8f72-4657-9660-d0395c8cd513_918x539.gif" width="918" height="539" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/60bcbb18-8f72-4657-9660-d0395c8cd513_918x539.gif&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:539,&quot;width&quot;:918,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:4058370,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/gif&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!6gi_!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60bcbb18-8f72-4657-9660-d0395c8cd513_918x539.gif 424w, https://substackcdn.com/image/fetch/$s_!6gi_!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60bcbb18-8f72-4657-9660-d0395c8cd513_918x539.gif 848w, https://substackcdn.com/image/fetch/$s_!6gi_!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60bcbb18-8f72-4657-9660-d0395c8cd513_918x539.gif 1272w, https://substackcdn.com/image/fetch/$s_!6gi_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60bcbb18-8f72-4657-9660-d0395c8cd513_918x539.gif 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset 
pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>We now have the architecture diagram above, which we will use as a guide as we build this project. We might add changes here and there, but the main takeaway here is that instead of a single &#8216;end-to-end&#8216; pipeline, <strong>a production-ready machine learning system is a modular, easily maintainable, and scalable system</strong>. I&#8217;ve marked the ML tooling we plan to use to give a better idea of our end goal.</p><p>I think that should be about it for now. 
 In the next edition, we will start building our workflows.</p><p>Till then,</p><p>JO</p><div><hr></div>]]></content:encoded></item><item><title><![CDATA[#1 - Building machine learning systems is hard]]></title><description><![CDATA[But it doesn't have to be that way...]]></description><link>https://www.fullstackml.dev/p/building-machine-learning-systems</link><guid isPermaLink="false">https://www.fullstackml.dev/p/building-machine-learning-systems</guid><dc:creator><![CDATA[JO Reyes]]></dc:creator><pubDate>Tue, 17 Oct 2023 22:00:15 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/1d383913-6607-4ab1-9313-7ff7c038f2d7_775x442.gif" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hi friends,<br><br>Building machine learning systems is hard.</p><p>Software development is a complex process, where development teams have to juggle creativity, customer requirements, technology, and many delivery-related issues. 
Furthermore, in the last few years, more and more products have incorporated machine learning, making what was already a complex undertaking even more challenging.</p><p>In one of my <a href="https://fullstackdeveloper.tips/accelerate-ml-application-development-in-aws/#">first articles on machine learning</a>, I used the following image to describe a machine learning product (copyright <a href="https://cloud.google.com/architecture/mlops-continuous-delivery-and-automation-pipelines-in-machine-learning">Google</a>), showing all its components, and how model development plays a very minor role. How can one build a product given the image below?</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!GNNo!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F79b5ade8-c2bd-4415-8718-6063b2ce826c_1318x546.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!GNNo!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F79b5ade8-c2bd-4415-8718-6063b2ce826c_1318x546.png 424w, https://substackcdn.com/image/fetch/$s_!GNNo!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F79b5ade8-c2bd-4415-8718-6063b2ce826c_1318x546.png 848w, https://substackcdn.com/image/fetch/$s_!GNNo!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F79b5ade8-c2bd-4415-8718-6063b2ce826c_1318x546.png 1272w, https://substackcdn.com/image/fetch/$s_!GNNo!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F79b5ade8-c2bd-4415-8718-6063b2ce826c_1318x546.png 1456w" 
sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!GNNo!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F79b5ade8-c2bd-4415-8718-6063b2ce826c_1318x546.png" width="718" height="297.44157814871016" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/79b5ade8-c2bd-4415-8718-6063b2ce826c_1318x546.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:546,&quot;width&quot;:1318,&quot;resizeWidth&quot;:718,&quot;bytes&quot;:21769,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!GNNo!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F79b5ade8-c2bd-4415-8718-6063b2ce826c_1318x546.png 424w, https://substackcdn.com/image/fetch/$s_!GNNo!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F79b5ade8-c2bd-4415-8718-6063b2ce826c_1318x546.png 848w, https://substackcdn.com/image/fetch/$s_!GNNo!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F79b5ade8-c2bd-4415-8718-6063b2ce826c_1318x546.png 1272w, https://substackcdn.com/image/fetch/$s_!GNNo!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F79b5ade8-c2bd-4415-8718-6063b2ce826c_1318x546.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button 
tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>In the last couple of years, more and more tools have been added to the <a href="https://cloud.google.com/architecture/mlops-continuous-delivery-and-automation-pipelines-in-machine-learning">MLOps</a> ecosystem. Why is it then that a <a href="https://www.gartner.com/en/newsroom/press-releases/2022-08-22-gartner-survey-reveals-80-percent-of-executives-think-automation-can-be-applied-to-any-business-decision">recent Gartner survey</a> revealed that on average only 54% of AI projects make it from pilot to production? That&#8217;s just one percentage point up from the previous year. With all this tooling at our disposal, surely we&#8217;re missing something. 
Yes, many of our models just don&#8217;t make it to production.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://landscape.lfai.foundation/images/landscape.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!r5VB!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F48b3c1f8-d057-436c-82a9-98a912e82f1d_6360x3720.png 424w, https://substackcdn.com/image/fetch/$s_!r5VB!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F48b3c1f8-d057-436c-82a9-98a912e82f1d_6360x3720.png 848w, https://substackcdn.com/image/fetch/$s_!r5VB!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F48b3c1f8-d057-436c-82a9-98a912e82f1d_6360x3720.png 1272w, https://substackcdn.com/image/fetch/$s_!r5VB!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F48b3c1f8-d057-436c-82a9-98a912e82f1d_6360x3720.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!r5VB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F48b3c1f8-d057-436c-82a9-98a912e82f1d_6360x3720.png" width="1456" height="852" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/48b3c1f8-d057-436c-82a9-98a912e82f1d_6360x3720.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;normal&quot;,&quot;height&quot;:852,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:3758841,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://landscape.lfai.foundation/images/landscape.png&quot;,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!r5VB!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F48b3c1f8-d057-436c-82a9-98a912e82f1d_6360x3720.png 424w, https://substackcdn.com/image/fetch/$s_!r5VB!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F48b3c1f8-d057-436c-82a9-98a912e82f1d_6360x3720.png 848w, https://substackcdn.com/image/fetch/$s_!r5VB!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F48b3c1f8-d057-436c-82a9-98a912e82f1d_6360x3720.png 1272w, https://substackcdn.com/image/fetch/$s_!r5VB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F48b3c1f8-d057-436c-82a9-98a912e82f1d_6360x3720.png 1456w" sizes="100vw"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" 
stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>One would think that we&#8217;d have this figured out by now. It couldn&#8217;t be further from the truth. As described <a href="https://neptune.ai/blog/mlops-tools-platforms-landscape">here</a> and <a href="https://research.aimultiple.com/mlops-tools/">here</a>, it seems that more tool choices do not translate to increased success in productionizing ML systems. </p><p>So what are we to do then?</p><p><strong>Getting over software complexity</strong></p><p>Coming from a software engineering background, I'm no stranger to the complexities of software development. To address this, the industry employs various strategies, and one notable approach is the use of architectural frameworks. 
For instance, the <a href="https://www.tutorialspoint.com/mvc_framework/mvc_framework_introduction.htm">Model-View-Controller (MVC) framework</a> gained popularity in the .NET ecosystem around 2010, as it gave web developers a modular structure and a shared understanding of how web applications should be built. (As an aside, it&#8217;s not something you would want to use these days, as there have been many advancements since then.)</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!81co!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f2c3092-d13b-4bac-b48f-28fa14d5ea3e_2000x1500.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!81co!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f2c3092-d13b-4bac-b48f-28fa14d5ea3e_2000x1500.png 424w, https://substackcdn.com/image/fetch/$s_!81co!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f2c3092-d13b-4bac-b48f-28fa14d5ea3e_2000x1500.png 848w, https://substackcdn.com/image/fetch/$s_!81co!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f2c3092-d13b-4bac-b48f-28fa14d5ea3e_2000x1500.png 1272w, https://substackcdn.com/image/fetch/$s_!81co!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f2c3092-d13b-4bac-b48f-28fa14d5ea3e_2000x1500.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!81co!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f2c3092-d13b-4bac-b48f-28fa14d5ea3e_2000x1500.png" width="1456" height="1092" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4f2c3092-d13b-4bac-b48f-28fa14d5ea3e_2000x1500.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1092,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:42498,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!81co!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f2c3092-d13b-4bac-b48f-28fa14d5ea3e_2000x1500.png 424w, https://substackcdn.com/image/fetch/$s_!81co!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f2c3092-d13b-4bac-b48f-28fa14d5ea3e_2000x1500.png 848w, https://substackcdn.com/image/fetch/$s_!81co!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f2c3092-d13b-4bac-b48f-28fa14d5ea3e_2000x1500.png 1272w, https://substackcdn.com/image/fetch/$s_!81co!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f2c3092-d13b-4bac-b48f-28fa14d5ea3e_2000x1500.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" 
class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Consequently, it garnered widespread adoption within the industry because it made web application development more approachable and manageable. With a common understanding of the system, everyone involved had a better grasp of the process. This, in turn, led to an abundance of developers well-versed in the framework, making it easier and more cost-effective to find collaborators.</p><div><hr></div><p><strong>How about machine learning?</strong></p><p>As I mentioned earlier, the arrival of all these ML and MLOps tools had minimal impact on the ability to productionize ML systems. In fact, I dare say it may even have contributed to decision fatigue: confronted with so many choices, we find each decision harder to make. 
</p><p>Can we achieve the same outcome in machine learning as our developer cousins?</p><p>I believe so. The folks at <a href="https://www.hopsworks.ai/">Hopsworks</a> developed the <a href="https://www.hopsworks.ai/post/mlops-to-ml-systems-with-fti-pipelines">FTI architecture for machine learning systems</a>. Instead of the usual single &#8220;end-to-end pipeline&#8221;, where the mental model has everything including the kitchen sink, they have recommended three separate pipelines - the <strong>Feature pipeline</strong>, the <strong>Training pipeline</strong> and the <strong>Inference pipeline</strong>. </p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!kaMc!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bceba0f-935b-424d-9b0d-c416d6a0b289_772x440.gif" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!kaMc!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bceba0f-935b-424d-9b0d-c416d6a0b289_772x440.gif 424w, https://substackcdn.com/image/fetch/$s_!kaMc!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bceba0f-935b-424d-9b0d-c416d6a0b289_772x440.gif 848w, https://substackcdn.com/image/fetch/$s_!kaMc!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bceba0f-935b-424d-9b0d-c416d6a0b289_772x440.gif 1272w, https://substackcdn.com/image/fetch/$s_!kaMc!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bceba0f-935b-424d-9b0d-c416d6a0b289_772x440.gif 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!kaMc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bceba0f-935b-424d-9b0d-c416d6a0b289_772x440.gif" width="772" height="440" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7bceba0f-935b-424d-9b0d-c416d6a0b289_772x440.gif&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:440,&quot;width&quot;:772,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:2505825,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/gif&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!kaMc!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bceba0f-935b-424d-9b0d-c416d6a0b289_772x440.gif 424w, https://substackcdn.com/image/fetch/$s_!kaMc!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bceba0f-935b-424d-9b0d-c416d6a0b289_772x440.gif 848w, https://substackcdn.com/image/fetch/$s_!kaMc!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bceba0f-935b-424d-9b0d-c416d6a0b289_772x440.gif 1272w, https://substackcdn.com/image/fetch/$s_!kaMc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bceba0f-935b-424d-9b0d-c416d6a0b289_772x440.gif 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset 
pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Although <a href="https://www.uber.com/en-AU/blog/">Uber Engineering</a> pioneered <a href="https://www.uber.com/en-AU/blog/michelangelo-machine-learning-platform/">using multiple pipelines</a> to manage the scalability and complexity in machine learning systems, it looks like Hopsworks was the first to come up with the FTI architecture.</p><p><strong>But how do multiple pipelines help?</strong></p><p>At first glance, having multiple pipelines is counterintuitive. Wouldn&#8217;t a single pipeline be simpler, and easier to build and maintain? 
Well, yes, a single pipeline is simpler, no doubt. However, typical ML systems deal with dynamic data, and we should have an architecture that supports that from the start.</p><p>A <strong>Feature pipeline</strong> regularly creates &#8216;features&#8217; from incoming data and saves them to shared storage called a <a href="https://www.featurestore.org/what-is-a-feature-store">feature store</a>. Typical systems have regular (if not streaming) data ingestion pipelines, so the feature pipeline needs to run regularly: on demand for streaming systems, or on a schedule for batch systems. </p><p>A <strong>Training pipeline</strong> reads training data from the feature store, trains the model and saves the trained model in a model registry. And because this pipeline is stand-alone, it can be 1) developed by someone else, and 2) invoked on a different schedule from the other pipelines. </p><p>And finally, the <strong>Inference pipeline</strong> serves predictions. Call it on a schedule for batch inference systems, or, for online inference, deploy it on any of the model-serving infrastructure offered by the many service providers.</p><div><hr></div><p><strong>Benefits of the FTI architecture</strong></p><ul><li><p>a simple mental model of realistic machine learning systems</p></li><li><p>enables the development of maintainable ML systems in a relatively short period</p></li><li><p>reduced cognitive load when reasoning about the whole ML system</p></li><li><p>modular pipelines provide clear boundaries and interfaces, promoting collaboration between the multiple personas in an ML product team</p></li><li><p>an open architecture - cloud- and tool-agnostic, avoiding lock-in</p></li></ul><p><strong>Where do we go from here</strong></p><p>In the next few editions, let&#8217;s develop a machine learning system from the ground up, guided by this architecture. It will be fun, trust me. 
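</p><p>To make the separation concrete, the three pipelines can be thought of as independent programs that communicate only through shared storage. Here is a toy Python sketch of that idea - a hypothetical illustration, not the Hopsworks API - where plain dicts stand in for the feature store and the model registry:</p>

```python
# Toy sketch of the FTI split. The "feature store" and "model registry"
# are stand-in dicts; all names here are hypothetical, for illustration only.

feature_store = {}
model_registry = {}

def feature_pipeline(raw_rows):
    """Compute features from raw data and write them to the feature store."""
    feature_store["features"] = [{"x": row["value"] * 2} for row in raw_rows]

def training_pipeline():
    """Read features, fit a (trivial) model, publish it to the registry."""
    rows = feature_store["features"]
    mean_x = sum(r["x"] for r in rows) / len(rows)  # the "model" is just a mean
    model_registry["latest"] = lambda features: mean_x

def inference_pipeline(request):
    """Load the latest model from the registry and serve a prediction."""
    model = model_registry["latest"]
    return model(request)

# Each pipeline can run on its own schedule, owned by a different person:
feature_pipeline([{"value": 1}, {"value": 2}, {"value": 3}])
training_pipeline()
print(inference_pipeline({"value": 10}))  # -> 4.0
```

<p>Because each stage only reads from and writes to shared storage, any one of them can be rescheduled or rewritten without touching the other two - which is exactly the point of the architecture.</p><p>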
<br><br>Till then, <br><br>JO</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.fullstackml.dev/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Full Stack ML! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[Welcome to Full Stack ML!]]></title><description><![CDATA[Software and Machine Learning Systems]]></description><link>https://www.fullstackml.dev/p/welcome-to-full-stack-ml</link><guid isPermaLink="false">https://www.fullstackml.dev/p/welcome-to-full-stack-ml</guid><dc:creator><![CDATA[JO Reyes]]></dc:creator><pubDate>Mon, 09 Oct 2023 22:00:23 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F79438e85-8316-400a-8950-ca3e9c0e12cc_5184x3456.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hi everyone!<br><br>Welcome to <a href="http://fullstackml.dev/">Full Stack ML</a>!</p><p>I&#8217;m JO Reyes.</p><p>Originally from Cebu Philippines, I now call Sydney Australia home.  </p><p>I&#8217;m an experienced software engineer and have been skilling up on machine learning systems for the past couple of years. I&#8217;m also midway through my Masters of Data Science at UNSW which is going well.</p><div><hr></div><p>I love building things, and writing about it. 
I have a blog at <a href="https://fullstackdeveloper.tips/">Full Stack Developer Tips</a>, which I still regularly update; however, <a href="http://fullstackml.dev/">Full Stack ML</a> is for all my machine learning content. <br><br>With <a href="http://fullstackml.dev">Full Stack ML</a>, I hope to connect with many of you ML and AI folks so we can learn from each other. I plan to cover topics related to building machine learning systems, especially as they make their way into production.</p><div><hr></div><p>Anyhow, let&#8217;s keep this short today, as this is just a welcome message! The real newsletter will start next week.</p><p>Thanks for stopping by, and see you later!</p><p>Regards, </p><p>JO</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.fullstackml.dev/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading <a href="http://fullstackml.dev/">Full Stack ML</a>! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item></channel></rss>