What This Is

Step 10 unlocks AI beyond the keyboard. You can talk to AI like a colleague, show it what you're looking at, share your screen for real-time help, and dictate text at 3x typing speed. This isn't a gimmick—it's a fundamental shift in how you interact. The right input mode for the right moment transforms how fast you work.

Why It Matters

This is likely to be a huge unlock—and it feels uncomfortable at first. But push through. Here's the real breakthrough: you can speak in a stream of consciousness and AI organizes your thoughts for you. No more crafting emails word-by-word. Talk at your brain's pace, let AI structure it. This is where AI stops being "software" and becomes something different—an assistant embedded in your life in ways computers never were.

Tools

Whispr Flow: system-wide dictation at 3x typing speed | ChatGPT app: voice + live video on mobile | Gemini: voice + live video on mobile | Google AI Studio: desktop screen sharing

The Four Modes

1. Dictation: Whispr Flow
Speak and your words appear as text—anywhere. Whispr Flow works system-wide: emails, documents, Slack, browser forms, any text field. Hold a hotkey, speak naturally (including punctuation), release. Text appears instantly. This is pure speed: 3x faster than typing for most people.
Best for: Long emails, meeting notes, document drafts, any text where you know what to say
2. Voice Conversation: ChatGPT Voice, Gemini Live
Have a real back-and-forth conversation with AI. Unlike dictation (which just transcribes), voice conversation is interactive—AI listens, responds, and you can interrupt naturally. Think of it as calling a smart colleague. Works while walking, driving, cooking, exercising.
Best for: Brainstorming, thinking out loud, hands-free work, exploring ideas conversationally
3. Vision (Live Video): ChatGPT app, Gemini mobile
Turn on your phone camera while talking to AI—it sees what you're seeing in real-time and can guide you through tasks. Fixing a broken sink? Cooking a new recipe? Organizing a space? AI watches and talks you through it, hands-free. This is conversational help with vision, not uploading photos.
Best for: Repairs, cooking, DIY projects, real-time guidance for physical tasks
4. Screen Sharing: Google AI Studio (desktop)
Share your computer screen and let AI see what you see in real-time. AI watches as you navigate software, can see your cursor, and provides live guidance. Like having an expert looking over your shoulder while working on your computer.
Best for: Learning new software, debugging, Excel formulas, getting unstuck on computer tasks

When to Use Each Mode

Writing Faster
Use Dictation. Long email? Dictate it in 2 minutes instead of 15. Meeting notes? Speak them as you go. First drafts of any document.
Thinking Through Problems
Use Voice Conversation. Walk and talk through a decision. Brainstorm while commuting. Process ideas without typing.
Getting Real-Time Guidance
Use Vision (Live Video). Turn on Gemini's camera while fixing something. "I'm looking at this broken faucet, what should I try?" AI sees it and guides you.
Getting Unstuck in Software
Use Screen Sharing. Share your screen through Google AI Studio on desktop, or through Gemini on mobile. "Help me figure out how to do X." Get real-time guidance.

Key Distinction: Dictation vs Voice Conversation

Dictation

What it does: Converts speech to text

AI involvement: None—just transcription

Output: Text appears where your cursor is

Tool: Whispr Flow

Use when: You know what to say and want it typed fast

vs
Voice Conversation

What it does: Two-way dialogue with AI

AI involvement: Full—AI thinks, responds, debates

Output: AI speaks back to you

Tool: ChatGPT Voice, Gemini Live

Use when: You want to think out loud or need AI input

Start Here: Install Whispr Flow Today
Whispr Flow has the highest immediate ROI. Download from whispr.flow → Set a hotkey (Ctrl+Space works well) → Dictate your next long email instead of typing it. You'll feel the speed difference immediately. Once you've experienced 3x faster text entry, you'll never go back.

How To Do It

Dictation with Whispr Flow (whispr.flow) | Works on Mac & Windows
1. Download and install Whispr Flow from whispr.flow
2. Set your hotkey in preferences (Ctrl+Space or Option+Space)
3. Click any text field, hold the hotkey, and speak naturally
4. Release the hotkey—your text appears instantly
Speak punctuation naturally: "period" "comma" "new paragraph" "question mark"
Works everywhere: Slack, Gmail, Google Docs, browser forms, any app
Ideal for: emails over 100 words, meeting notes, first drafts, documentation
Speed gain: Most people speak at about 150 WPM and type at about 40 WPM, so dictation is roughly 3-4x faster
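The speed-gain arithmetic above is easy to check for yourself. Here is a minimal sketch: the 150 and 40 WPM figures come from the tip above, while the 500-word email length and the function name are only illustrative assumptions. It also ignores pauses and corrections, so treat the results as rough estimates.

```python
def minutes_to_write(words: int, words_per_minute: float) -> float:
    """Minutes needed to produce `words` at a steady rate."""
    return words / words_per_minute


DICTATION_WPM = 150  # speaking rate quoted in the tip above
TYPING_WPM = 40      # typing rate quoted in the tip above

speedup = DICTATION_WPM / TYPING_WPM
email_words = 500  # example length (an assumption for illustration)

print(f"Speedup: {speedup:.2f}x")                                           # 3.75x
print(f"Typing: {minutes_to_write(email_words, TYPING_WPM):.1f} min")       # 12.5 min
print(f"Dictating: {minutes_to_write(email_words, DICTATION_WPM):.1f} min") # 3.3 min
```

Swap in your own typing speed and typical email length to see what the gain looks like for you.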
Voice Conversation: ChatGPT Voice (mobile + desktop) | Gemini Live (mobile)
1. Open ChatGPT app and tap the headphone icon (or use Gemini Live)
2. Start talking naturally—no need to be formal or precise
3. Interrupt anytime by just speaking—AI will stop and listen
4. Have a real conversation: explore, clarify, go deeper
Great conversation starter: "Help me think through a decision I'm facing..."
Works hands-free: while walking, driving, cooking, exercising
Gemini Live has longer memory and is better at sustained conversation
Switch to text when you need precise output or complex formatting
Vision Mode (Live Video): ChatGPT app | Gemini mobile | AI sees what you're seeing in real time
1. Open ChatGPT or Gemini app on your phone and start a voice conversation
2. Turn on video mode—your camera activates and AI can see what you're looking at
3. Talk naturally: "I'm looking at this broken sink. What should I try?" AI sees it and responds
4. Have a back-and-forth conversation while AI watches—it's like having an expert with you
This is conversational vision—AI sees what you see and talks you through it, hands-free
Both ChatGPT app and Gemini support live video on mobile
Perfect for: fixing things, cooking new recipes, organizing spaces, DIY projects
Clear lighting and steady camera angle get better results
Screen Sharing: Google AI Studio (desktop) | Gemini (mobile)
1. Open Google AI Studio on desktop or Gemini on mobile
2. Enable screen sharing—AI Studio sees your desktop, Gemini sees your phone screen
3. Ask for help: "Watch my screen and help me figure out how to..."
4. Navigate as AI guides you in real-time
Google AI Studio shares your computer screen; Gemini shares your phone screen
Perfect for: Excel formulas, unfamiliar software, debugging, mobile app tutorials
Privacy: Don't share screens with sensitive data visible
Both provide real-time guidance while you work

Real-World Examples

10x Email Speed

Used Whispr Flow to dictate a 500-word project update in 2 minutes. Would have taken 15+ minutes typing.

Walking Brainstorm

30-minute walk with ChatGPT Voice. Talked through a strategic decision. Arrived with clarity and a plan.

Menu Translation

In Japan, pointed camera at menu. Gemini translated everything and flagged dishes with my allergens.

Excel Debugging

Stuck on a complex formula. Shared screen with Gemini Live. It spotted the error and walked me through the fix.

Contract Analysis

Photographed a 10-page contract. Uploaded to Claude. "What are the key terms I should negotiate?"

Hands-Free Cooking

Cooking with flour-covered hands. "Hey ChatGPT, convert 180 Celsius to Fahrenheit." No touching required.
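If you want to double-check the answer you hear, the conversion in that request is simple arithmetic. This is a small sketch of the standard formula; the function name is just illustrative.

```python
def celsius_to_fahrenheit(celsius: float) -> float:
    """Convert Celsius to Fahrenheit: multiply by 9/5, then add 32."""
    return celsius * 9 / 5 + 32


print(celsius_to_fahrenheit(180))  # 356.0
```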

Which Tool Supports What?

Tool | Dictation | Voice Chat | Live Video | Screen Share
Whispr Flow | Best-in-class | - | - | -
ChatGPT app | - | Yes (mobile + desktop) | Yes (mobile only) | -
Gemini | - | Yes (mobile) | Yes (mobile only) | Mobile screen share
Google AI Studio | - | - | - | Desktop only

Making It Stick: Build the Habits

Week 1
Install Whispr Flow. Dictate every email over 50 words. Notice the time saved.
Week 2
Try one voice conversation during a walk or commute. Think out loud about a real decision.
Week 3
Upload an image to ChatGPT or Claude. Photograph a document, menu, or whiteboard. Ask questions.
Week 4
Try screen sharing when stuck in software. Let Gemini Live guide you through something new.
Key Insight

"Multi-modal AI is where your relationship with AI fundamentally changes. Talking to your computer feels strange at first. But push through—AI stops being a tool you visit and becomes an assistant that's everywhere. The keyboard was the barrier. Now it's gone."
