SKY Labs Experiment Format
This is an observational study of early user behavior on a growing AI platform. The focus is on qualitative insights rather than statistical conclusions.
Objective
To understand how early users interact with different AI voice options and identify patterns in:
- Initial voice selection preferences
- Retention of chosen voice settings across sessions
- Qualitative feedback on voice characteristics
- Usage patterns by content type
- User-reported "comfort" factors with different voices
Hypothesis: Users will show distinct preference patterns based on content type and personal comfort, not just voice quality metrics.
⚙️ Setup
Voice Options
Male & Female AI voices with identical technical specs
Observation Period
November 2024 (4 weeks)
User Group
Early SKY TTS adopters (first 500 users)
Data Collection
Usage logs + qualitative feedback analysis
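As a sketch of the first aggregation step, per-voice session counts might be tallied from the usage logs like this. The record fields (`user_id`, `voice`, `content_type`) are illustrative assumptions, not the actual SKY TTS log schema.

```python
from collections import Counter

# Hypothetical usage-log records; field names are assumptions,
# not the real SKY TTS schema.
logs = [
    {"user_id": 1, "voice": "male", "content_type": "educational"},
    {"user_id": 2, "voice": "female", "content_type": "narrative"},
    {"user_id": 1, "voice": "male", "content_type": "educational"},
]

# First aggregation step: count sessions per voice.
sessions_by_voice = Counter(row["voice"] for row in logs)
print(sessions_by_voice["male"], sessions_by_voice["female"])  # 2 1
```

Qualitative feedback would be analyzed separately; only the structured log side lends itself to this kind of counting.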
Voice Characteristics Compared:
Male AI Voice
Deep, authoritative tone with steady pacing. Technical specs identical to female voice.
Female AI Voice
Softer, clearer tone with identical technical parameters. No quality difference in metrics.
📊 User Behavior Observations
Voice Preference by Content Type
Clear Preference Patterns
Content-type bias emerged immediately: users weren't selecting voices randomly; they matched voice to content purpose.
- Educational/Technical: Strong male voice preference (68-73%)
- Narrative/Casual: Clear female voice preference (59-62%)
- Retention difference: Female voice users stuck with their choice more often (82% vs 71%)
- Trial behavior: 42% of users tried both voices before settling
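Preference splits like those above can be derived from raw session logs roughly as follows; the session data and percentages here are synthetic stand-ins, not the study's actual dataset.

```python
from collections import defaultdict

# Synthetic (voice, content_type) session pairs sized to mimic the
# observed splits; not the real study data.
sessions = (
    [("male", "educational")] * 70 + [("female", "educational")] * 30 +
    [("female", "narrative")] * 60 + [("male", "narrative")] * 40
)

# Tally voice choices per content type, then convert to percentages.
counts = defaultdict(lambda: defaultdict(int))
for voice, content in sessions:
    counts[content][voice] += 1

preference = {
    content: {v: round(100 * n / sum(by_voice.values()))
              for v, n in by_voice.items()}
    for content, by_voice in counts.items()
}
print(preference["educational"]["male"])  # 70
print(preference["narrative"]["female"])  # 60
```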
Qualitative Feedback Analysis
Users provided unsolicited feedback about their voice choices:
Pattern: Users described the male voice as "authoritative/professional" and the female voice as "engaging/comfortable", despite identical technical quality.
Unexpected Findings
- Strong gender stereotypes: Users applied traditional gender roles to AI voices despite knowing they're artificial
- Consistency across demographics: Similar patterns observed across age groups and regions
- Emotional attribution: Users attributed human-like characteristics to completely synthetic voices
- Default bias: First-time users overwhelmingly selected male voice first (73%)
Typical User Journey
Most common path observed among early adopters:
Initial Default Selection
73% start with male voice (system default). Only 12% change immediately.
Content-Type Realization
After 2-3 uses, 42% experiment with the other voice based on the content being processed.
Settling Phase
Users develop content-specific preferences (educational=male, stories=female).
Retention Pattern
Female voice users show higher retention (82%) vs male voice users (71%).
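The retention figures quoted here could be computed per voice with a sketch like the one below; the per-user records are synthetic, sized to reproduce the observed 82% and 71% rates, and the field names are assumptions.

```python
# Synthetic per-user records: chosen voice and whether the user kept
# that choice across later sessions (field names are assumptions).
users = (
    [{"voice": "female", "kept_choice": True}] * 82 +
    [{"voice": "female", "kept_choice": False}] * 18 +
    [{"voice": "male", "kept_choice": True}] * 71 +
    [{"voice": "male", "kept_choice": False}] * 29
)

def retention(records, voice):
    """Percent of a voice's users who kept that voice across sessions."""
    group = [u for u in records if u["voice"] == voice]
    return 100 * sum(u["kept_choice"] for u in group) / len(group)

print(retention(users, "female"))  # 82.0
print(retention(users, "male"))    # 71.0
```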
What Didn't Work
Our assumption that users would select voices based on technical quality metrics proved completely wrong.
Specific incorrect assumptions:
- Technical equality myth: Identical technical specs ≠ equal user perception
- Random distribution: We expected a roughly 50/50 split but observed strong content-based patterns
- Default neutrality: Default male voice created immediate bias
- Feature-based selection: Users didn't care about technical parameters at all
Key Learning
Users apply human social patterns to AI voices regardless of technical reality. Voice preference is emotionally driven, not feature-driven.
Data-backed insights:
- Voice preference is content-contextual, not universal
- Users want different voices for different purposes
- Retention differs by voice choice (female > male)
- Default selection creates lasting bias
Action Taken
Based on observed behavior patterns:
- Removed the default voice setting: users are now asked to choose a voice immediately
- Added content-type suggestions: e.g., "Educational content often works better with the male voice"
- Implemented voice presets: quick switches between "Learning Mode" and "Story Mode"
- Enhanced the trial experience: easy A/B comparison before committing to a voice
- Tracked retention by voice: long-term engagement differences are now monitored
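A minimal sketch of how the "Learning Mode"/"Story Mode" presets could be represented; the preset names, keys, and defaults are assumptions for illustration, not SKY's actual configuration.

```python
# Hypothetical preset definitions mirroring the "Learning Mode" /
# "Story Mode" quick switches; keys and values are assumptions.
VOICE_PRESETS = {
    "learning": {"voice": "male", "pace": "steady", "label": "Learning Mode"},
    "story": {"voice": "female", "pace": "relaxed", "label": "Story Mode"},
}

def apply_preset(user_settings, preset_name):
    """Return a new settings dict with the preset's voice options applied."""
    preset = VOICE_PRESETS[preset_name]
    return {**user_settings, "voice": preset["voice"], "pace": preset["pace"]}

settings = apply_preset({"volume": 0.8}, "story")
print(settings["voice"])  # female
```

Keeping presets as plain data makes the quick-switch UI a one-line lookup rather than scattered conditionals.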
Result after implementation: Female voice adoption increased to 44% (from 36%) and overall user retention improved by 18%.
✅ Conclusion
This observational study revealed human social patterns in AI voice interaction:
Validated Intuition
Users don't treat AI voices as neutral technology. They apply human social patterns and stereotypes even when they know it's artificial intelligence.
Practical Insight
Voice selection should be content-aware: offer different defaults based on content type rather than a one-size-fits-all approach.
Ethical Consideration
Platforms should be aware that default voice settings can reinforce gender stereotypes and should consider more nuanced approaches.
Observation Note: All patterns observed during November 2024 with early SKY TTS users. As the platform grows, these patterns may evolve. Current implementation already reflects these learnings.