Voice Inspiration vs. Fantasy Modelling in Gender-Affirming Voice Training
- SpeechAppeal

- Mar 5
Can My Voice Sound Like This?
When people begin gender-affirming voice training, a common question arises: “Can my voice sound like this?” Many bring examples or links of voices they admire, hoping to see whether their own voice might eventually resemble a particular speaker. This curiosity is natural, and voice references can be motivating when used intentionally.
Having examples saved or bookmarked can be helpful, but over-focusing on “finding voices you like,” especially as early home practice, can sometimes create unhelpful expectations. It’s important to remember that the voices we hear online often reflect years of development. Speakers may have trained extensively, refined their voice through real-life communication, explored coaching or surgical interventions, or simply come from a background in singing or performance.
Much of what we hear online is a finished product. Comparing your early attempts to that polished result can create pressure, unrealistic timelines, or the sense that something is wrong if your voice doesn’t immediately match.
There is also a deeper reason to approach modelling with care. Voices are individual: they reflect physiology, personality, communication style, and lived experience. Just like fingerprints, no two voices are identical. Focusing too early on replicating someone else’s sound can pull attention away from discovering what your voice is capable of doing.
All of this said, modelling can be a powerful tool when used intentionally. In gender-affirming voice training, learning how to use voice models thoughtfully is often part of the exploration process. Like any tool, its effectiveness depends on timing, purpose, and framing. Let’s explore.
A great deal goes into voice. Affirming voice is often reduced to pitch, but true vocal development is far more complex and holistic than that. For further reading, check out “More Than Pitch in Gender Affirming Voice”.
Fantasy Modelling vs. Voice Inspiration
Early gender-affirming voice training focuses on building capacity: developing a healthy base, pitch flexibility, resonance awareness, breath coordination, articulatory precision, and prosodic control while training the nervous system to produce patterns that may not yet feel automatic. Research in adult motor learning shows that early skill acquisition is most effective when learners focus on process rather than outcome[5]. Outcome-focused attention can increase self-criticism and reduce learning efficiency.
When someone else’s outcome becomes the standard you are measuring yourself against, this is what I refer to as fantasy modelling. Perhaps they were born using that voice, or maybe they have spent years developing it. What you are hearing could reflect years of practice beyond where you are right now. Without structured guidance, this type of focus can become discouraging and pull attention away from the foundational skills that support growth and progress.
Voice inspiration, on the other hand, uses modelling intentionally. Instead of trying to replicate someone else’s voice, the goal is to explore specific vocal features: try them on and notice how they feel. The focus shifts toward observing and experimenting with characteristics such as rhythm, pitch contour, phrase shape, or resonance balance. This approach aligns with research on imitation in singing pedagogy, which shows that imitation of targeted vocal features strengthens auditory feedback loops and engages sensorimotor regions associated with vocal control[1,6]. In other words, intentional imitation is not simply copying. It is building a sensorimotor map for vocal control.
As you work with voice models, it can be helpful to pause and reflect: Am I using this model intentionally, to explore and learn, or am I measuring myself against it in a way that creates pressure or discouragement? This simple self-check encourages awareness and helps ensure that modelling supports your learning rather than undermining it.
Finding a Voice That Serves and Suits You
A core principle I share with clients is this: the goal is to find a voice that serves and suits you, not necessarily one you love the sound of. Aesthetic preference alone rarely accounts for everything your voice has to carry: your personality, your physiology, and your communication style.
For example, I personally love a deep, full, femme-fatale kind of voice, and I use it when it makes sense contextually. But more often, my true communication style is animated and energetic (think: flying chipmunk). A lilting, bright, higher-pitched timbre fits my personality and expressive needs, and it is healthy and sustainable. A voice aligned with your physiology, personality, and social context is easier to use consistently and less likely to create tension than one chosen purely for aesthetic admiration.
Voices Change With Context and Emotion
I mentioned using my femme-fatale colour when it makes sense contextually. What does that mean? My voice is flexible enough to allow those colours to come through when I want them to.
Voices are dynamic. They shift with context, energy, and emotion.
If this sounds interesting to you, check out “Using Your Voice to Calm Your Nervous System”, which highlights the link between vocal behaviour and nervous system regulation.
My animated, high-pitched timbre is less vibrant when I am low-energy, frustrated, or just not feeling it, and that variability is completely normal and healthy.
For some gender-diverse and genderqueer people, voice can also shift with identity. Depending on context, social setting, or personal comfort, someone might express a more feminine, masculine, or androgynous voice at different times. This flexibility is a normal part of exploring and inhabiting your voice, and it highlights that voice is not fixed: it moves with who you are and how you feel.
I encourage clients to explore a wide palette of vocal colours through play and creative activities. Voice can shift in many contexts, whether gaming, role-playing, or recording audiobooks. These playful, low-pressure environments give learners a safe space to experiment with pitch, resonance, phrasing, and expressive qualities.
Hot tip: If you use multiple languages, try experimenting with voice models in each language. This can help you capture the natural flow, rhythm, and phrasing of different languages while expanding your vocal flexibility and expressiveness.
Purposeful modelling can complement this exploration. Trying on someone’s vocal qualities, style, or phrasing can be part of play. It helps clients notice how specific features feel and how they might carry them into everyday communication. Motor learning research supports this approach. Experimenting with variability in different contexts builds adaptability and robustness in skill performance[4].

A Client Example
What does this look like in practice? A recent client example illustrates how intentional modelling can support exploration without creating pressure to replicate someone else’s voice.
I recently began working with a client who felt limited in how expressive their voice could be. Their voice was functional, but it felt flat and restricted. They described it as not fully reflecting their personality and sometimes even disingenuous. In conversation, they struggled to vary rhythm, emphasis, pitch, or vocal colour in ways that felt authentic.
At one point in their training, they brought in a voice model they admired. Rather than trying to replicate the speaker’s identity, we used the model as a lens. We asked: What is happening rhythmically? Where do phrases lift or taper? How does the speaker shift energy across a sentence?
They practised along with short clips, isolating specific features instead of copying the whole voice. This experimentation began to unlock new vocal colours. Their voice started to feel less like a fixed setting and more like a palette.
From there, they explored these qualities in creative, low-pressure contexts outside sessions, such as role-playing games, recording narrative passages, and improvising character voices. These environments provided space to exaggerate, experiment, and notice what felt natural.
Only after this phase did we intentionally transfer elements into everyday conversation. At first, the new expressiveness felt unfamiliar, which is expected. Any motor pattern that differs from long-established habits can initially feel artificial, even when it aligns with identity. With repetition, lived experience, and real-world usage, the unfamiliar became integrated.
Over time, their voice gained dynamic range in ways it had never expressed before. This was not about copying someone else, but about accessing parts of themselves that had not yet had room to surface. When a voice has space and support, it can expand to reflect who you are more fully.
Phase Clarity: Acquisition vs. Transfer
Phase clarity, meaning knowing which stage of learning you are in, is critical. In the early acquisition phase, the focus is on developing foundational skills: establishing control, building awareness, and exploring new patterns safely. Later, the focus shifts to transfer, the point at which the voice is used in complex, real-world contexts where cognitive load, emotion, and social interaction all come into play. Research in motor learning shows that transfer, not isolated repetition, is what makes new skills stable and functional[3]. Modelling and imitation can support this process, but only when applied intentionally and within a structured learning framework.
Integrating Models Safely and Effectively
Modelling becomes most powerful when it is intentional and focused. Instead of trying to copy someone else’s voice, models can be used to explore specific features such as rhythm, pitch contour, phrase shape, resonance balance, pacing, and strategic pausing.
Several of these elements, including rhythm, pitch contour, pacing, and pausing, are often described as suprasegmental features of speech. They shape how speech flows across phrases and sentences rather than belonging to individual sounds. Research shows that listeners rely heavily on these broader prosodic patterns when forming impressions about gender expression, personality, and emotional tone[7].
When learners isolate and experiment with these features, they begin to build stronger auditory-motor connections. This type of targeted exploration allows safe expansion of vocal range, experimentation with new colours, and gradual development of expressiveness without strain or frustration[2,6].
The key is always alignment with your goals, personality, and everyday communication needs. Voice is not something to replicate. It is a skill to inhabit. Thoughtful use of models helps you access parts of your voice that may have been dormant, unlock new ways of expressing yourself, and integrate those skills into real-world contexts. Over time, exploration becomes mastery, and your voice becomes more dynamic, expressive, and authentically yours.
Conclusion
Voice inspiration can accelerate learning and deepen exploration when used thoughtfully. Fantasy modelling, without structure, risks distraction and discouragement. By framing modelling intentionally, supporting auditory-motor integration, and integrating skill transfer, clients develop voices that are functional, healthy, expressive, and authentically theirs. The key is purposeful exploration, structured learning, and alignment with the individual’s personality, physiology, and communication needs.
If you’re interested in exploring your voice with guided support, you can book an intake appointment or start with a free meet-and-greet to see how our team can help you unlock your expressive voice.
References
[1] Eliades SJ, Wang X. Neural mechanisms of vocal communication. Nature Reviews Neuroscience. 2017.
[2] Guenther FH. Neural control of speech. Cambridge, MA: MIT Press. 2016.
[3] Magill RA, Anderson DI. Motor learning and control: Concepts and applications. 11th edition. New York: McGraw-Hill Education. 2017.
[4] Schmidt RA, Lee TD. Motor control and learning: A behavioral emphasis. 6th edition. Champaign, IL: Human Kinetics. 2011.
[5] Wulf G, Shea CH, Lewthwaite R. Motor skill learning and performance: A review. Frontiers in Psychology. 2010;1:1–12.
[6] Zarate JM, Zatorre RJ. Experience-dependent neural substrates in singing. Annals of the New York Academy of Sciences. 2008.
[7] Pisanski K, Rendall D. The perception of voice pitch and its relation to social judgments. Trends in Cognitive Sciences. 2011.



