Kae Learns in Public

Speech to Code: Spec-Driven Development with Wispr Flow and Claude

Outline of presentation with lightly edited additional notes dictated through Wispr Flow

Speech to Code

Okay, so I'm going to try something a little bit different here because I want to demo the technology that I'm talking about.

And this is an experiment. I've rehearsed this several times in one of the phone rooms, but I haven't ever done this live.

My name's Kae Sluder. I am a recently drafted front-end developer. I'm a new career developer. My background is in e-learning design. I switched to full-time software development about five years ago, and I have been loving it ever since.


Problem

So the big problem: I'm just going to raise my hand so everybody can see. I broke my hand just after I was asked to move here full-time in October, and one-handed typing has a number of pain points for a programmer. The big ones are:


Speech Recognition

So there are a variety of technologies that I could have used to work around this. This is unfortunately not my first hand injury, so in the past I've leaned towards speech recognition. Currently, in 2026, there are basically two main options for voice coding. Talon Voice is a semi-open-source speech engine.

Talon is entirely programmable in Python, so it offers very high customization, but the downside of that is you have to do a lot of customization to get it working with your flow.

A newer product, Wispr Flow, is what I'm using here. It is especially designed for dictating English text to be fed into an AI prompt, as well as email and other things that we have to communicate in.


English to Code

So after about a month and a half of trying to dictate JavaScript anyway, I started moving towards a spec-driven, AI-supported development system. The basic premise is that you write your algorithms or your specifications in plain English, and then you use an AI agent to translate that into JavaScript or TypeScript or whatever other language you're working in.


Low level: Pseudocode Algorithm Description

// Pseudocode example one
Loop for i from 1 to 30, including 30
  If i is evenly divisible by 15, print "fizzbuzz"
  Else if i is evenly divisible by 5, print "buzz"
  Else if i is evenly divisible by 3, print "fizz"
  Else print i
End the loop

This is an example of a very low-level approach, which I've tried out and still use now and then when I can't get the AI to work at this level of detail any other way. This is pseudocode; other people might call it semi-literate programming, although that's a contested term. You describe what you want the computer to do in English, and then you ask the AI to translate it. This is an example I dictated myself last week. It's very simple because I wanted it to fit on one screen.
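For illustration, here's roughly the JavaScript an agent might produce from that pseudocode (the helper function and its name are my own framing, not part of the original dictation):

```javascript
// A plausible JavaScript translation of the pseudocode above.
// fizzbuzz(i) returns the string that should be printed for a given i.
function fizzbuzz(i) {
  if (i % 15 === 0) return "fizzbuzz";
  if (i % 5 === 0) return "buzz";
  if (i % 3 === 0) return "fizz";
  return String(i);
}

// Loop for i from 1 to 30, including 30.
for (let i = 1; i <= 30; i++) {
  console.log(fizzbuzz(i));
}
```

Note that the pseudocode checks 15 first; an agent has to preserve that ordering, since checking 3 or 5 first would shadow the "fizzbuzz" case.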


High-level: spec/behavior-based development

  1. Spec file
  2. Plan file
  3. Question & review cycle
  4. Execute plan
  5. Review

See also: BMAD, GitHub spec-kit

So now we're getting to what everybody is talking about these days: spec-driven development and behavior-based development.

The basic idea is that you start with a spec file, which is a description of what you want the feature to do. You want to focus on just the what and the when, and not the how. Then you expand that into a plan file, which breaks the development down into specific steps. You work through both within a question and review cycle to refine what you're talking about, execute the plan, and then review the resulting code.


Spec File

A detailed description of exactly what you want the AI to create.

So a spec file is a detailed description of exactly what you want the AI to create. Now, I said earlier to focus on just the what, and I'm going to immediately break that rule. There are a lot of frameworks that will give you very specifically formatted spec files for this, and I've experimented with some of them; it has been like drinking from a fire hose lately.

But some of the things that you want to put in a spec file are:


Behavior

When, given, then = Action, condition, result.

So this is where my educational background comes in handy. Behavior-driven specs have two key requirements:

  1. They have to be observable.
  2. They describe the feature you're creating in terms of an action, a condition, and a result.

There have been a variety of ways to define this over the last hundred years or so that this has been used in education, and then this has been adapted or maybe reinvented within the software development world.

When the user clicks on the Next button
Given that the user has filled in all required fields
Then the user advances to step two of the form

This is a Gherkin language behavior description. When the user clicks on the next button, given that the user has filled in all of the required fields, then the user advances to step two of the form.

When the user enters more than 50 characters into the `First Name` field
Then the user sees an error message
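A spec like that second one maps almost mechanically onto code. Here is a minimal sketch of the validation behavior (the function name, the constant, and the message wording are mine, purely illustrative):

```javascript
// Illustrative sketch of the second behavior above: more than 50
// characters in the First Name field should produce an error message.
const MAX_FIRST_NAME_LENGTH = 50;

function validateFirstName(value) {
  if (value.length > MAX_FIRST_NAME_LENGTH) {
    return { valid: false, error: "First Name must be 50 characters or fewer." };
  }
  return { valid: true, error: null };
}
```

The point is that an observable action-condition-result spec leaves very little for the AI to guess at.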

Plan Files

After the spec file is created (let me do a quick time check), the next step is to have the AI create a plan file, which is a detailed, step-by-step set of instructions. Almost always I use Claude's plan mode to do this. It is very good at breaking down a complex feature set or complex development task into a series of steps that it can understand. Along the way, it will ask you questions about things that you haven't fully clarified in the spec file, and this is an important part of the process.


Document Driven Process

Spec and plan files are living documents. They can be revised and reviewed multiple times before code generation.

I want to emphasize that this is a document-driven process, and there has been kind of a debate going on between "the vibe coders who kind of pants it using the AI prompt window" and the spec-driven developers.

And the document-driven process provides a number of advantages:

It is also helpful in terms of managing AI performance. A lot of this is anecdotal wisdom, but AIs tend to perform better and hallucinate less when you keep the context down to about 50% of the maximum context size. That's conventional but unproven wisdom that's been flying around.

The other thing you want to avoid is context poisoning, which is a case where you say, "Hey, let's think about using library A," and then you decide, "No, that's not going to work; let's switch to library B." The mentions and discussion of library A are going to stay in that memory until you clear it, and that can bias the results. Another problem you want to avoid is fixation, where the AI will do analysis and think, "Okay, I have absolutely found what the problem is here; it is this." Sometimes the AI is wrong, and getting it to shift its focus away is not easy. It is easier to just say, "Okay, I'm going to wipe your memory, and we'll start this discussion all over again."

What you can do with an AI, you can't do with people


Generate Code and Tests

Tests are a mandatory part of the process because generative AI is built for pattern matching. It will think through the problem as best it can, but it can't give you a definite "yes, this is correct" or "no, it's not" answer. At some point in your specs, you want to say, "Okay, this is a hard requirement, and this is the command you run in order to verify it." If the test doesn't pass, you need to go back through and fix the code.


Human Review

So, human review is also an important part of the process. Sometimes the AI is wrong. Often, what the AI gives you could be better. There's the disclaimer: always double check what the AI gives you, and that is especially true before you commit to bringing code into your main codebase.

My practice: I try not to generate more code than I can comfortably review in a single session, or maybe the next day. Also, as someone who is still learning a lot of this stuff and making the transition from hobby programmer to full-time programmer: if I can't explain how it works, if I don't understand how it works, I need to learn how it works, because I am going to be on the hook for fixing it if it breaks.


Caveats

Some of the caveats so far are that it's not necessarily faster. Some people have reported it's very fast. Some people have reported it's not fast. Some people have reported it's fast, but code quality is significantly compromised. These are issues that we need to work out.

I've been leaning on this a lot because of the broken hand, but it is not something I'm choosing to go into blindly. This needs to be evaluated, and I need to take care in terms of how and when to use it intentionally.

Okay, another thing that's been found is that developers don't do faster work, but they do more detailed work. The focus shifts to architecture review, design considerations, and more complete testing. That is something that we need to think about and manage.

There are a couple of papers on this, including from Anthropic. There is a comprehension risk: people who rely on an AI to hand them code without ever looking at the documentation, as opposed to using the AI to interrogate what the library is doing and how the library is designed, come out of the experiments with lower comprehension of what was created. Again, intentionality very much matters in terms of how you are using the tool.