How to Use the Python Text-to-Speech Engine: pyttsx3

weixin_46064243 · Published 2025-12-23 14:50:09

What is pyttsx3?

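pyttsx3 is a Python text-to-speech (TTS) library that supports offline speech synthesis. It wraps each platform's native speech engine, so it runs on Windows, Linux, and macOS without an internet connection. pyttsx3 is the successor to pyttsx: it fixes some of the older library's problems and refines its features.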

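Core features

Offline operation: does not depend on an online API; it calls the system's local speech engine directly.
Multi-platform support: works with Windows (SAPI5), Linux (eSpeak), and macOS (NSSpeechSynthesizer).
Adjustable parameters: speech rate, volume, and voice (gender or a specific installed voice) can be changed.
Event-loop integration: runAndWait() blocks until speech finishes, but the engine can also be driven from an external loop (startLoop(False), iterate(), endLoop()) so that speech does not block the main program.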
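The non-blocking mode in the list above can be sketched roughly as follows. This is a minimal sketch, not the library's prescribed pattern: the helper name `speak_without_blocking`, the `on_tick` callback, and the `max_ticks` and `pause` parameters are assumptions made for illustration. It relies only on the engine methods `connect`, `disconnect`, `say`, `startLoop`, `iterate`, and `endLoop`, and how smoothly the external loop runs may differ between drivers.

```python
import time


def speak_without_blocking(engine, text, on_tick, max_ticks=10_000, pause=0.01):
    """Speak `text` while calling `on_tick()` between engine iterations.

    `engine` is expected to be a pyttsx3 engine, e.g. from pyttsx3.init().
    Returns True if the 'finished-utterance' event arrived before max_ticks.
    """
    state = {"done": False}

    # pyttsx3 passes event data as keyword arguments, so the parameter
    # names must match the event fields (name, completed).
    def on_finished(name, completed):
        state["done"] = True

    token = engine.connect("finished-utterance", on_finished)
    engine.say(text)
    engine.startLoop(False)  # False: the caller drives the loop itself
    try:
        ticks = 0
        while not state["done"] and ticks < max_ticks:
            engine.iterate()   # let the driver process pending work
            on_tick()          # the caller's own work happens here
            time.sleep(pause)
            ticks += 1
    finally:
        engine.endLoop()
        engine.disconnect(token)
    return state["done"]
```

A call might look like `speak_without_blocking(pyttsx3.init(), "Hello", update_progress_bar)`, where `update_progress_bar` is whatever the main program needs to keep doing while the speech plays. The `max_ticks` cap is only a safeguard in case the finished event never arrives.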

Installation

Install with pip:

pip install pyttsx3

Basic usage example

import pyttsx3


def text_to_voice(content):
    """Speak content aloud using pyttsx3.

    Arguments:
        content : str : content to be converted to speech
    Returns:
        result : bool : True if successful, False otherwise
    """
    result = False
    try:
        # Initialize the TTS engine
        engine = pyttsx3.init()

        # Set the speech rate to 170 WPM (words per minute).
        # The default is usually 200; slowing down slightly makes speech clearer.
        engine.setProperty('rate', 170)

        # Set the volume. The range is 0.0 to 1.0; 1.0 is the maximum.
        engine.setProperty('volume', 1.0)

        # Get the list of available voices and select the first one.
        # Depending on the platform this may be a male or a female voice.
        voices = engine.getProperty('voices')
        engine.setProperty('voice', voices[0].id)

        # Queue the text to be spoken
        engine.say(content)

        # Process the queue and block until playback finishes
        engine.runAndWait()

        print(f"Successfully converted text to speech: {content}")
        result = True
    except Exception as error:
        print(f"Exception occurred while converting text to speech: {error}")
        result = False
    return result

Core of the pyttsx3 main module

# Import the engine class
from .engine import Engine
import weakref

# Weak-reference dictionary that tracks the active engine instances
_activeEngines = weakref.WeakValueDictionary()

def init(driverName=None, debug=False):
    '''
    Constructs a new TTS engine instance or reuses the existing instance for
    the driver name.

    @param driverName: Name of the platform specific driver to use. If
        None, selects the default driver for the operating system.
    @type: str
    @param debug: Debugging output enabled or not
    @type debug: bool
    @return: Engine instance
    @rtype: L{engine.Engine}
    '''
    try:
        eng = _activeEngines[driverName]
    except KeyError:
        eng = Engine(driverName, debug)
        _activeEngines[driverName] = eng
    return eng

# Simple text-to-speech helper
def speak(text):
    engine = init()
    engine.say(text)
    engine.runAndWait()
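To make the caching in init() concrete, here is a small standalone sketch of the same pattern using only the standard library. `DemoEngine`, `_active`, and `get_engine` are hypothetical stand-ins, not pyttsx3 names. The sketch shows that repeated lookups return the same live instance, and that the entry disappears once nothing else holds a strong reference to it.

```python
import gc
import weakref


class DemoEngine:
    """Hypothetical stand-in for pyttsx3's Engine; it only stores a name."""

    def __init__(self, driver_name):
        self.driver_name = driver_name


# Values are held weakly: an entry lives only as long as its engine does.
_active = weakref.WeakValueDictionary()


def get_engine(driver_name=None):
    """Return the cached engine for driver_name, creating it on a miss."""
    try:
        return _active[driver_name]
    except KeyError:
        engine = DemoEngine(driver_name)
        _active[driver_name] = engine
        return engine


first = get_engine("demo")
second = get_engine("demo")
same_instance = first is second  # the cache handed back the live object

# Drop every strong reference; the weak dictionary then forgets the entry.
del first, second
gc.collect()
entry_survived = "demo" in _active
```

One practical consequence, assuming the real module behaves like this sketch: properties set on an engine stay in effect for later init() calls with the same driver name only while some part of the program still holds a reference to that engine.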

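Drawbacks of pyttsx3

Limited voice quality
Output depends on the system engine. Some engines (eSpeak, for example) sound noticeably robotic and unnatural, especially in languages other than English.
Weak extensibility
There is no support for advanced features such as emotional synthesis, SSML markup, or real-time audio streaming, which narrows the range of use cases.
Slow maintenance
The project is updated infrequently, support for new Python releases may lag, and long-term maintenance is uncertain.
Limited multilingual support
Non-English languages have poorer pronunciation and fewer voices to choose from, and some languages require manual engine configuration.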
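For the manual configuration that non-English languages may need, one approach is to search the installed voices for a keyword and switch to the first match. The sketch below assumes each voice object exposes `id`, `name`, and `languages` attributes, and that some drivers report languages as bytes. The helper names `find_voices` and `use_language` are made up for illustration, and the keyword that works (for example "zh" or "chinese") depends on what the system engine reports.

```python
def _as_text(value):
    """Decode bytes (some drivers report them) and return lowercase text."""
    if isinstance(value, bytes):
        value = value.decode("utf-8", errors="ignore")
    return str(value).lower()


def find_voices(voices, keyword):
    """Return the voices whose id, name, or languages contain `keyword`."""
    needle = keyword.lower()
    matches = []
    for voice in voices:
        languages = voice.languages or []
        if isinstance(languages, (str, bytes)):
            languages = [languages]  # tolerate a single value instead of a list
        fields = [voice.id, voice.name, *languages]
        if any(needle in _as_text(f) for f in fields if f is not None):
            matches.append(voice)
    return matches


def use_language(engine, keyword):
    """Switch `engine` to the first matching voice; return it, or None."""
    matches = find_voices(engine.getProperty("voices"), keyword)
    if matches:
        engine.setProperty("voice", matches[0].id)
        return matches[0]
    return None
```

With a real engine this would be called as `use_language(pyttsx3.init(), "zh")`. A return value of None means no installed voice matched, in which case a voice or language pack has to be installed at the operating-system level first.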

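Summary

Despite the drawbacks above, pyttsx3 remains a capable and easy-to-use text-to-speech library. It is particularly well suited to applications that need offline speech synthesis, such as automated testing and voice assistant development.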
