I shipped a Chrome extension this week that I've been wanting for a while: right-click any social video on a page and get a clean transcript back, without leaving the tab.
Why I built this
I write content about short-form video and spend a lot of time pulling hooks, scripts, and outlines out of TikToks, Reels, and Shorts. The flow used to be:
- Copy URL
- Open another transcription tool
- Paste, wait, download
- Tab back to where I was
That's a lot of context switches for "I just want the words from this video." A right-click context menu felt like the right interface.
What it does
The result is the Voqusa Chrome Extension. Right-click on any web page with a video (or paste a video URL into the popup) and a transcript appears in about 30–60 seconds.
- Supports TikTok, YouTube Shorts, Instagram Reels, Facebook, Twitter/X, LinkedIn, and Pinterest
- Whisper-grade speech-to-text on the backend
- Anonymous users get 3 free transcripts; sign in at voqusa.com to use the full free tier or pay-as-you-go credits
- 133 KiB install, no tracking pixels
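To make the platform list concrete, here's a rough sketch of how a submitted URL could be matched against the supported platforms before it goes anywhere. This is illustrative, not the shipped source: the hostname table, the `Platform` type, and the `detectPlatform` name are all assumptions.

```typescript
// Hypothetical platform identifiers, one per supported site.
type Platform =
  | "tiktok" | "youtube" | "instagram" | "facebook"
  | "twitter" | "linkedin" | "pinterest";

// Assumed hostname table; the real extension may match URLs differently.
const PLATFORM_HOSTS: Array<[Platform, string[]]> = [
  ["tiktok", ["tiktok.com"]],
  ["youtube", ["youtube.com", "youtu.be"]],
  ["instagram", ["instagram.com"]],
  ["facebook", ["facebook.com", "fb.watch"]],
  ["twitter", ["twitter.com", "x.com"]],
  ["linkedin", ["linkedin.com"]],
  ["pinterest", ["pinterest.com", "pin.it"]],
];

// Returns the matching platform, or null for unsupported or invalid URLs.
function detectPlatform(rawUrl: string): Platform | null {
  let host: string;
  try {
    host = new URL(rawUrl).hostname.toLowerCase();
  } catch {
    return null; // not a parseable URL
  }
  for (const [platform, hosts] of PLATFORM_HOSTS) {
    // Match the bare domain or any subdomain (www., m., and so on).
    if (hosts.some((h) => host === h || host.endsWith("." + h))) {
      return platform;
    }
  }
  return null;
}
```

Matching on the parsed hostname rather than a substring means a lookalike domain such as `nottiktok.com` is rejected instead of being treated as TikTok.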
Architecture
The extension itself is tiny: all it does is hand the video URL to the cloud transcription API. The constraints I cared about:
- Privacy: the extension only sends the URL you explicitly submit. No DOM, no cookies, no browsing history.
- Anonymous-first: transcripts are tied to a local device ID, so you can try it without signing in.
- Fast: a Whisper-grade model, and only the audio is downloaded (not the full video).
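The privacy and anonymous-first constraints are easier to see in code. Below is a minimal sketch of the request side, not the actual implementation: the endpoint URL, the field names, and the `KeyValueStore`, `getOrCreateDeviceId`, and `buildTranscriptRequest` names are assumptions made up for illustration.

```typescript
// Hypothetical endpoint; the real API URL is not documented here.
const API_ENDPOINT = "https://api.example.com/v1/transcripts";

// The entire payload: nothing from the DOM, cookies, or browsing history.
interface TranscriptRequest {
  url: string;      // the video URL the user explicitly submitted
  deviceId: string; // anonymous, locally generated identifier
}

// Minimal async key-value interface. In an extension this could wrap
// chrome.storage.local; it is abstract here so the sketch runs anywhere.
interface KeyValueStore {
  get(key: string): Promise<string | undefined>;
  set(key: string, value: string): Promise<void>;
}

// Reuse the stored device ID, or create and persist one on first use.
// `newId` is injected; in a browser it could be () => crypto.randomUUID().
async function getOrCreateDeviceId(
  store: KeyValueStore,
  newId: () => string,
): Promise<string> {
  const existing = await store.get("deviceId");
  if (existing) return existing;
  const created = newId();
  await store.set("deviceId", created);
  return created;
}

// Build the POST request from exactly two fields.
function buildTranscriptRequest(videoUrl: string, deviceId: string) {
  const payload: TranscriptRequest = { url: videoUrl, deviceId };
  return {
    endpoint: API_ENDPOINT,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(payload),
    },
  };
}
```

The result of `buildTranscriptRequest` can be passed straight to `fetch(endpoint, init)`. Keeping the payload behind a two-field type makes it hard to leak anything else by accident, because adding a field is a visible change to the interface.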
What I'm using it for
- Pulling hooks and outlines from competitor short-form videos
- Turning lecture clips into citable text
- Building swipe files of high-performing scripts
- Reading along when audio isn't an option (open offices, late nights)
If you find yourself transcribing social videos often, give it a try and let me know what's missing.