What is TAWK and why was it built?

TAWK is a voice-to-text Mac app that runs entirely on your local machine using OpenAI's Whisper model. It was built because existing voice-to-text tools required cloud processing (raising privacy concerns), charged monthly subscriptions, or were unreliable. TAWK costs $29 one-time, works offline, and supports 90+ languages.

Why does TAWK work offline instead of using cloud APIs?

TAWK processes all speech locally using OpenAI's Whisper model because voice data is inherently private. People dictate emails, journal entries, business notes, and personal messages. Sending that audio to a server creates unnecessary privacy risk. Offline processing also means zero network latency, no API costs, and the app works without an internet connection.

Why is TAWK a one-time purchase instead of a subscription?

TAWK runs entirely on the user's hardware with no cloud infrastructure, so there are no ongoing server costs. Charging a monthly subscription for software that uses zero cloud resources felt dishonest. The $29 one-time price means customers own the product forever, which aligns with the philosophy that good tools should respect both your privacy and your wallet.

How accurate is TAWK compared to cloud-based dictation?

TAWK uses OpenAI's Whisper small model, which provides excellent accuracy for everyday dictation in 90+ languages. While some cloud services may have marginal accuracy advantages for edge cases, Whisper's accuracy is more than sufficient for emails, messages, notes, and documents. Trading a somewhat smaller model for complete privacy and zero network latency is well worth it for most users.

Why I Built a $29 Voice-to-Text App That Never Touches the Cloud

I spend most of my day job at Mindvalley thinking about scale. Millions of users, hundreds of programs, multi-million dollar ad budgets, complex funnels. But the product I am most proud of building is a tiny Mac app that costs $29 and does exactly one thing: it turns your voice into text without ever sending a single byte to the internet.

The app is called TAWK. You press a keyboard shortcut, talk, and your words appear wherever your cursor is. That is it. No account. No cloud. No subscription. It runs entirely on your machine using OpenAI's Whisper model, and when you close it, the only trace of what you said is the text you dictated, sitting on your own hard drive.

I want to explain why I built it this way, because the decisions behind TAWK reflect something I think about a lot: what it actually means to build software you believe in.

The Problem Was Personal

I type fast, but I think faster. In any given day, I am writing Slack messages, emails, strategy documents, meeting notes, creative briefs, and ad copy. My thoughts come in paragraphs, but my fingers produce them one word at a time. The friction between thinking and typing is real, and it compounds over a full workday.

I tried every voice-to-text solution I could find. Apple's built-in dictation was unreliable and required an internet connection. The polished third-party options wanted $10 or $15 a month. Some were accurate but introduced noticeable latency because they processed audio on remote servers. Others stored recordings in the cloud for "quality improvement," which is a polite way of saying they keep your voice data.

None of them felt right. I wanted something that worked instantly, processed everything locally, and did not treat my voice as a data asset. When I could not find it, I decided to build it. That weekend project became TAWK.

Why Offline Processing Is a Moral Choice

There is a pragmatic argument for offline AI: lower latency, no internet dependency, zero server costs. All true. But the real reason I built TAWK to run locally is simpler than that. It is about respect.

Think about what people dictate. Emails to their spouse. Messages to their therapist. Business ideas they have not shared with anyone. Journal entries. Draft texts they delete before sending. Voice-to-text captures raw, unfiltered human thought. Sending that to a server is an act of extraordinary trust, and most voice-to-text apps do not deserve that trust.

When I designed TAWK, I made a rule: no audio data leaves the machine. Not for processing, not for analytics, not for model improvement. Your Mac's processor does all the work using OpenAI's Whisper model running locally. When the transcription is done, the audio buffer is discarded. There is no recording saved. There is no server to hack. There is nothing to subpoena.
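
To make the rule concrete, here is a minimal Python sketch of "transcribe, then discard." It is not TAWK's actual code. The class name `DictationSession`, its methods, and the injected `transcribe` callable are all hypothetical stand-ins for a microphone capture loop and a local speech-to-text call.

```python
from typing import Callable


class DictationSession:
    """Holds captured audio in memory only and wipes it after transcription."""

    def __init__(self, transcribe: Callable[[memoryview], str]) -> None:
        # `transcribe` is a placeholder for a local model call (assumption).
        self._transcribe = transcribe
        self._buffer = bytearray()

    def feed(self, chunk: bytes) -> None:
        # Append audio samples to the in-memory buffer; nothing touches disk.
        self._buffer.extend(chunk)

    @property
    def buffered_bytes(self) -> int:
        return len(self._buffer)

    def finish(self) -> str:
        try:
            # Hand the transcriber a view of the buffer rather than a copy.
            # The view is released as soon as the `with` block exits.
            with memoryview(self._buffer) as view:
                return self._transcribe(view)
        finally:
            # Runs on success and on failure: overwrite with zeros, then empty.
            self._buffer[:] = bytes(len(self._buffer))
            self._buffer.clear()


# Usage with a fake transcriber that only reports how much audio it saw.
session = DictationSession(lambda audio: f"{len(audio)} bytes heard")
session.feed(b"\x01\x02")
session.feed(b"\x03")
text = session.finish()
```

The point of the `finally` block is that the buffer is wiped even if transcription raises. Python cannot promise that the transcriber made no internal copies, so treat this as an illustration of the design, not a security guarantee.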

Privacy is not a feature. It is a design constraint that shapes every other decision you make.

This decision cascaded through the entire product. No cloud means no user accounts. No user accounts means no onboarding friction. No onboarding friction means someone can go from downloading TAWK to dictating their first sentence in under a minute. Privacy made the product simpler, and simpler products win.

The $29 Question

The SaaS model is deeply ingrained in software culture. Recurring revenue, higher lifetime value, investor-friendly metrics. I understand why founders default to subscriptions. But I could not bring myself to charge monthly for something that uses zero cloud resources.

TAWK runs on your hardware. It uses your CPU. It stores nothing on my servers because I do not have servers. The only ongoing cost on my end is hosting a static website. Charging someone $10 a month for an app that lives entirely on their machine felt dishonest. So I set the price at $29 one-time and never looked back.
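
The arithmetic behind that comparison is short enough to write down. The sketch below uses the $29 price and the $10 and $15 monthly fees mentioned in this post; the three-year horizon and the function names are assumptions chosen for illustration.

```python
import math

ONE_TIME_PRICE = 29  # dollars, TAWK's price as stated in the post


def break_even_months(monthly_fee: float, one_time_price: float = ONE_TIME_PRICE) -> int:
    """First month in which cumulative subscription fees reach the one-time price."""
    return math.ceil(one_time_price / monthly_fee)


def subscription_total(monthly_fee: float, months: int) -> float:
    """Total paid to a subscription after the given number of months."""
    return monthly_fee * months


HORIZON_MONTHS = 36  # assumption: a three-year comparison window

for fee in (10, 15):
    print(
        f"${fee}/month reaches ${ONE_TIME_PRICE} in month {break_even_months(fee)}; "
        f"{HORIZON_MONTHS} months cost ${subscription_total(fee, HORIZON_MONTHS)}"
    )
# $10/month reaches $29 in month 3; 36 months cost $360
# $15/month reaches $29 in month 2; 36 months cost $540
```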

There is a deeper philosophy here. I think the subscription model has broken something in software. When every tool charges monthly, the cumulative cost of your software stack becomes absurd. $10 here, $15 there, $20 for something you use twice a week. By the end of the month, you are paying hundreds of dollars for tools that mostly sit idle. We have normalized renting software that runs on our own computers.

TAWK is a small protest against that. You pay once. You own it. It works forever. If I release a major update, you get it. There is no annual upsell. There is no "Pro tier." There is just the app, doing the one thing it was built to do.

Some people have told me I am leaving money on the table. They are probably right. But I sleep well knowing that every person who buys TAWK got a fair deal. That matters more to me than optimizing lifetime value.

Building with Whisper: The Technical Bet

When I started building TAWK, I had to choose a speech recognition engine. The options came down to two: a cloud API (Google, AWS, Deepgram) or a local model. I chose OpenAI's Whisper, specifically the "small" model, and I have not regretted it once.

Whisper's small model is roughly 460MB. It loads into memory when TAWK starts and stays there. When you press the shortcut and speak, the audio is processed against this local model in under a second on any Apple Silicon Mac. The accuracy is genuinely excellent for everyday dictation. It handles accents well. It understands context. It supports over 90 languages.
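
The "load once, keep it resident" pattern described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not TAWK's implementation: `ResidentModel` and `approx_size_mib` are hypothetical names, and the loader is injected so the sketch runs without any speech library installed. The size estimate assumes about 244 million parameters (the figure OpenAI publishes for the small model) stored at 16 bits each, which is itself an assumption about the file format.

```python
from typing import Callable, Optional


class ResidentModel:
    """Loads a model on first use and keeps it in memory for every later call."""

    def __init__(self, loader: Callable[[], object]) -> None:
        # `loader` is any zero-argument function that returns a loaded model.
        # With the openai-whisper Python package it could be
        # `lambda: whisper.load_model("small")` (assumption: TAWK may use
        # a different runtime entirely).
        self._loader = loader
        self._model: Optional[object] = None
        self.load_count = 0

    def get(self) -> object:
        if self._model is None:
            # Slow path: paid once, at startup or on first dictation.
            self._model = self._loader()
            self.load_count += 1
        return self._model


def approx_size_mib(parameters: int, bytes_per_parameter: int) -> float:
    """Rough weight size in MiB, ignoring file headers and metadata."""
    return parameters * bytes_per_parameter / 2**20


# Usage with a stand-in loader instead of a real model.
resident = ResidentModel(lambda: {"name": "small"})
first = resident.get()
second = resident.get()   # same object, no second load
estimate = approx_size_mib(244_000_000, 2)  # about 465 MiB
```

Keeping the model resident trades a few hundred megabytes of memory for not paying the load time on every dictation, which is what makes a sub-second response plausible.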

Could I get marginally better accuracy from a cloud API with a larger model? Probably, for certain edge cases. But the trade-off is not close. With Whisper running locally, TAWK has zero network latency, works on airplanes, works in cafes with terrible WiFi, and never sends your voice anywhere. That is a better product for 99% of use cases.

I wrote about the detailed comparison between TAWK and other voice-to-text tools on the TAWK blog if you want the full breakdown. The short version: cloud-based tools are not meaningfully better at transcription, but they are meaningfully worse at privacy.

What I Learned About Building Products You Believe In

My day job involves building products at massive scale. At Mindvalley, I think about growth funnels, conversion rates, retention metrics, and advertising efficiency. I love that work. But building TAWK taught me something that operating at scale can obscure: the best products are built from conviction, not from market analysis.

I did not build TAWK because a market gap analysis told me to. I built it because I wanted it to exist. Every design decision was filtered through a simple question: would I want to use this? Would I feel good recommending this to a friend? Would I be comfortable if someone I respected saw how this works under the hood?

That filter eliminated a lot of "smart" product decisions. I could have added telemetry to understand usage patterns. I did not, because I would not want an app monitoring how I use it. I could have required an email address for the download. I did not, because I hate giving my email to download software. I could have added a freemium tier with a daily usage limit. I did not, because artificial limitations are annoying and I would not tolerate them myself.

Build the product you would want to buy. Then sell it to people like you.

The Compounding Value of Simplicity

TAWK does one thing. Press a key, speak, see text. I get feature requests every week. People want live transcription of meetings. They want custom hotwords. They want integration with Notion, Obsidian, and every other note-taking app. They want voice commands. They want speaker identification.

I say no to almost everything. Not because those are bad ideas, but because complexity is the enemy of reliability. Every feature you add is a feature that can break. Every integration is a dependency that can change. Every option is cognitive load for the user. TAWK's power is that it does one thing and does it flawlessly. You never have to think about it. You never have to configure it. You just talk and it works.

This is a lesson I have carried back to my work at Mindvalley. The most successful products we have built are the ones with the tightest scope. Not the ones with the most features, but the ones that nail a single experience so well that users tell their friends about it unprompted.

Who Actually Uses TAWK

I built TAWK for myself, but the people who use it are a much more interesting group than I expected. Writers who draft entire articles by dictating while they pace around their office. Developers who use it for writing documentation and commit messages. People with RSI who cannot type for extended periods. Multilingual professionals who switch between languages throughout the day.

The use case I did not anticipate was people using TAWK for journaling. Something about speaking your thoughts into a text field feels different from typing them. There is a rawness to it. You do not self-edit as much when you are talking. Several users have told me TAWK changed their journaling practice because the friction of typing was the thing stopping them from doing it consistently.

That is the kind of feedback that does not show up in a product analytics dashboard. It shows up in a one-line email from someone who took the time to write to you. Those emails are worth more than any metric.

The Broader Case for Tools That Respect You

TAWK is a small product. It is a menu bar app that costs $29. It is not going to change the world. But I think it represents something worth advocating for: software that respects the person using it.

Respect means not harvesting data you do not need. Respect means charging a fair price once instead of extracting maximum revenue over time. Respect means building something that works reliably and then getting out of the way. Respect means saying "your voice data never leaves your machine" and actually meaning it, not burying exceptions in a privacy policy.

The best voice-to-text tools are the ones that disappear into your workflow. You should not be thinking about your dictation software. You should be thinking about what you are saying. TAWK was designed to be invisible. It lives in your menu bar, it responds to a shortcut, and it types what you said. Then it goes back to being invisible.

I think we are entering an era where people are going to care more about where their data goes. The AI boom has made everyone aware that their inputs are training data for someone else's model. Voice data is especially sensitive. People are starting to ask: does this app work on my device, or does it send my voice somewhere? That question is going to shape the next generation of software products.

What Comes Next

TAWK is not done. I use it every single day, which means I find new things to improve constantly. Whisper models keep getting better. Apple Silicon keeps getting faster. The gap between local and cloud processing is narrowing, not widening.

But the core principles are not going to change. Offline processing. One-time pricing. No accounts. No telemetry. No voice data leaving your machine. These are not features that I might remove in a future version. They are the foundation that everything else is built on.

If any of this resonates with you, try TAWK. It is $29. It works on any Mac running macOS 12 or later. You can be dictating in under a minute. And if you are a builder thinking about your next project, I hope this post encourages you to make the decisions you actually believe in, even when they are not the ones that optimize for revenue.

The products that last are the ones built with conviction. Everything else is just software.