
The Speaker WordPress Plugin converts a post or webpage into human-like speech, offering more than 194 voices across 35+ languages and variants. It applies groundbreaking research in speech synthesis (WaveNet) and Google’s robust neural networks to deliver high-fidelity audio.
Notes
This manual was created for plugin version 2.0.0 and higher. If you use an older version of the plugin, use the Speaker settings manual for versions 1.0.^.
To open the plugin settings, go to Speaker in the WordPress sidebar menu.

In the plugin settings, there are several tabs responsible for various groups of plugin settings.
- Voice
- Design
- Speakable Markup
- Post Types
- Audio Content
- Storage
- Podcasts
- Assignments
- Activation
- Status
- Updates
- Uninstall
Voice Settings
The Voice tab contains the basic settings of the plugin: the selection of voices and the speech synthesis features.
API Key File
When you first run the plugin, you need to connect the Key File. To learn how to get the Key File, see the article About Key File for the Speaker & Voicer WordPress Plugins.
Now used
Shows which voice is currently selected and lets you listen to a sample of that voice with default settings.

You can also listen to the Supported Google Voices and Languages in the Google Cloud Documentation.
Language
Use the filter to find the voices you need in the table. After selecting a language, you can filter the table by voice type and gender.

To select and activate a voice, click on the row in the table. The selected voice will be highlighted in blue.
Advanced voice settings
The toggle enables advanced voice and audio file settings. For experienced users.
Audio Format
Select the format in which the audio will be delivered: MP3 or WAV. Recordings already generated in the other format become unavailable, so after switching the format you need to recreate the audio files for each post.
Audio Encoding
The encoding determines the output audio format.
- LINEAR16 – uncompressed 16-bit signed little-endian samples (Linear PCM). Audio content returned as LINEAR16 also contains a WAV header.
- MP3 – MP3 audio at 32kbps.
- OGG_OPUS – opus encoded audio wrapped in an ogg container. The result will be a file which can be played natively on Android, and in browsers (at least Chrome and Firefox). The quality of the encoding is considerably higher than MP3 while using approximately the same bitrate.
- MULAW – 8-bit samples that compand 14-bit audio samples using G.711 PCMU/mu-law. Audio content returned as MULAW also contains a WAV header.
- ALAW – 8-bit samples that compand 14-bit audio samples using G.711 PCMA/A-law. Audio content returned as ALAW also contains a WAV header.
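To get a feel for how the encodings listed above differ in size, here is a rough sketch that estimates the audio payload for a given duration. It assumes mono audio, treats 32kbps as 32,000 bits per second, and ignores WAV headers and container overhead. The function name and the default sample rate are hypothetical and not part of the plugin. OGG_OPUS is left out because the text does not state its bitrate.

```python
def estimated_bytes(encoding: str, seconds: float, sample_rate_hz: int = 24000) -> int:
    """Rough payload size in bytes (assumption: mono audio, headers ignored)."""
    if encoding == "MP3":
        # 32 kbps = 32,000 bits per second = 4,000 bytes per second
        return int(32_000 / 8 * seconds)
    if encoding == "LINEAR16":
        # 16-bit samples take 2 bytes each
        return int(sample_rate_hz * 2 * seconds)
    if encoding in ("MULAW", "ALAW"):
        # 8-bit samples take 1 byte each
        return int(sample_rate_hz * 1 * seconds)
    raise ValueError(f"no size estimate for encoding: {encoding}")


one_minute_mp3 = estimated_bytes("MP3", 60)
one_minute_wav = estimated_bytes("LINEAR16", 60, sample_rate_hz=16000)
```

Under these assumptions, one minute of MP3 is about 240 KB, while one minute of LINEAR16 at 16 kHz is about 1.9 MB.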
Audio Profile
You can optimize the synthetic speech produced by Cloud Text-to-Speech API for playback on different types of hardware. For example, if your app runs primarily on smaller, ‘wearable’ types of devices, you can create a synthetic speech from Cloud Text-to-Speech API that is optimized specifically for smaller speakers.
Speaking Rate/Speed
This setting changes the voice playback speed. The normal speed is 1. You can select any value in the range from 0.25 to 4: the lower the number, the slower the synthesized voice speaks, and the higher the number, the faster it speaks.
Pitch
Speaking pitch, in the range [-20.0, 20.0]. 20 means an increase of 20 semitones from the original pitch. -20 means a decrease of 20 semitones from the original pitch. The default value is 0.
Volume Gain
The setting controls the audio gain via the slider. Available values range from -10 dB to 15 dB.
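The ranges documented for Speaking Rate, Pitch and Volume Gain can be summarized in a small sketch. The setting names and helper functions here are hypothetical and only illustrate the limits stated above. They are not the plugin's own code. The pitch helper uses the standard definition of a semitone as a frequency ratio of 2^(1/12).

```python
# Documented ranges from the settings above; the dictionary keys are hypothetical names.
RANGES = {
    "speaking_rate": (0.25, 4.0),
    "pitch": (-20.0, 20.0),
    "volume_gain_db": (-10.0, 15.0),
}


def clamp_setting(name: str, value: float) -> float:
    """Limit a value to the documented range of the given setting."""
    low, high = RANGES[name]
    return max(low, min(high, value))


def pitch_ratio(semitones: float) -> float:
    """Frequency ratio for a pitch shift given in semitones."""
    return 2 ** (semitones / 12)
```

For example, a pitch of 12 semitones doubles the frequency, and an out-of-range speaking rate of 10 would be limited to 4.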
Sample Rate
Specify the synthesis sample rate.
We recommend a sample rate of at least 16 kHz in the audio files that you use for transcription with Speech-to-Text. Sample rates found in audio files are typically 16 kHz, 32 kHz, 44.1 kHz, and 48 kHz. Because intelligibility is greatly affected by the frequency range, especially in the higher frequencies, a sample rate of less than 16 kHz results in an audio file that has little or no information above 8 kHz. This can prevent Speech-to-Text from correctly transcribing spoken audio. Speech intelligibility requires information throughout the 2 kHz to 4 kHz range, although the harmonics (multiples) of those frequencies in the higher range are also important for preserving speech intelligibility. Therefore, keeping the sample rate to a minimum of 16 kHz is a good practice.
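The link between sample rate and the highest frequency that can be represented follows from the Nyquist limit (half the sample rate). A minimal sketch, with a hypothetical function name:

```python
def max_frequency_hz(sample_rate_hz: float) -> float:
    """Highest frequency a given sample rate can represent (Nyquist limit)."""
    return sample_rate_hz / 2


limit_at_16k = max_frequency_hz(16_000)
limit_at_8k = max_frequency_hz(8_000)
```

At 16 kHz the limit is 8 kHz, which covers the 2 kHz to 4 kHz range mentioned above together with its first harmonics. At 8 kHz the limit drops to 4 kHz.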
Automatic synthesis
This setting enables automatic speech synthesis when you press the Update or Publish button.
Notes
This option can significantly increase your monthly Google Cloud usage and costs.
With this option turned on, every time you change a page and save it, the entire content of the page is sent for speech synthesis again. For example:
- you create a post of 5000 characters and publish it – the content is automatically sent to Google Cloud and voiced, and 5000 characters are deducted from your balance
- you add one comma (one character) to the text and click Save – all 5001 characters are automatically sent for speech synthesis, and 5001 characters are deducted from your balance
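The arithmetic of the example above can be sketched as follows. The function name is hypothetical. The only assumption is the one stated in the text: every save sends the full content of the post.

```python
def billed_characters(lengths_at_each_save: list[int]) -> int:
    """Total characters deducted when every save re-sends the whole post."""
    return sum(lengths_at_each_save)


# Publish at 5000 characters, then save again after adding one comma.
total = billed_characters([5000, 5001])
```

Here the two saves together use 10001 characters, even though only one character changed.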
Design
The design tab contains settings related to the design of the Player and its surrounding elements.
Player position
You can select one of the available player positions:
- Before Content
- After Content
- Top Fixed
- Bottom Fixed
- Before Title
- After Title
- Before in Custom Filter – the option provides a field for entering a theme filter or a custom filter before which the player will be added.
- After in Custom Filter – the option provides a field for entering a theme filter or a custom filter after which the player will be added.
- WordPress Hook (action) – the option provides a field for entering a custom WordPress action to which the player will be added.
- Shortcode [speaker]
Important
Some themes may have problems displaying the player in the Before Title and After Title positions. In such cases, switch the player to another position.
You can easily add the Speaker WordPress Plugin player anywhere on the page using the shortcode [speaker]
You can also show the player with the audio recording of another post. To do this, use the shortcode [speaker id=PAGEID]
PAGEID is the ID of the page whose audio version you want to use in the player.

You can find out the page ID in the address bar during editing or from the page code.
Style
The plugin has six player styles that you can use on the website pages. You can also customize any of the presented player templates with CSS.
Round Player
Core WordPress player with round edges:

Rounded Player
Core WordPress player with rounded corners:

Squared Player
Core WordPress player with right angles and straight edges:

WordPress Default Player
Regular WordPress audio player:

Chrome Style player

Browser Default Player
Default browser player without any styles and scripts. The appearance of the player differs between browsers:

The Chrome and Opera audio players show a link to download the audio file even when the Download Link is hidden in the plugin settings.


Please note that when you choose the Browser Default Player, the standard WordPress player is disabled on your site. This affects all players on all pages of your site.
To fine-tune the design of the player, use the CSS class .mdp-speaker-box
Background color
This setting sets the background color of the player. It is available for all styles except WordPress Default Player and Browser Default Player.
Download link
The setting determines how to display the download link. You can choose one of these options:
- Do not show
- Backend Only
- Frontend Only
- Backend and Frontend


Speed controls
The option enables a control panel for the audio playback speed. When the option is enabled, two additional settings are available:
- Speeds section title – enter the title that will be displayed before the speed settings panel.
- Available speeds – a field for entering the speed values that will be available to the user. The speed values must be separated by commas. Use a period for decimal numbers, for example 1.2, 1.5, 1.75.
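The format expected in the Available speeds field (comma-separated values with a period as the decimal separator) can be illustrated with a short parsing sketch. The function is hypothetical and only mirrors the format described above. It is not how the plugin itself reads the field.

```python
def parse_speeds(field: str) -> list[float]:
    """Split a comma-separated list of speeds and convert each entry to a number."""
    speeds = []
    for part in field.split(","):
        part = part.strip()
        if part:  # skip empty entries such as a trailing comma
            speeds.append(float(part))
    return speeds


speeds = parse_speeds("1.2, 1.5, 1.75")
```

A value written with a decimal comma, such as "1,5", would be read as two separate speeds (1 and 5), which is why the period is required.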


Audio Preload
The preload attribute specifies if and how the audio file should be loaded when the page loads.

- None – the browser should NOT load the audio file when the page loads.
- Metadata – the browser should load only metadata when the page loads.
- Auto – the browser should load the entire audio file when the page loads.
- Backend – the browser should NOT load the audio file but the plugin requests information about the file on the backend and saves its duration in HTML.
Content before the player
The toggle enables an input field for adding text, an image or HTML markup before the audio player.

Description after
The toggle enables an input field for adding text, an image or HTML markup after the audio player.
Autoplay
The option allows playing audio after page load. Some browsers do not allow autoplay for any audio or video according to their policy, so this feature may not work for certain browsers.
Player | Chrome | Firefox | Opera | Safari | Edge |
WordPress default player | No | No | Yes | No | Yes |
Browser default player | No | No | Yes | Yes | Yes |
Loop
The option allows looping the audio playback.
Speakable Markup (from version 3.4.0)
This toggle enables or disables the Speakable markup for the post/page types selected in Speaker > Post Types.
The speakable schema.org property identifies sections within an article or webpage that are best suited for audio playback using text-to-speech (TTS). Adding markup allows search engines and other applications to identify content to read aloud on Google Assistant-enabled devices using TTS. Web pages with speakable structured data can use the Google Assistant to distribute the content through new channels and reach a wider base of users.
The additional options are available when the Speakable Markup toggle is On.
- Markup all posts – an option to enable the markup even for posts without an audio file generated.
- JSON+LD Markup – a field for modifying the JSON-LD markup according to the Speakable Markup Guide. The default markup version also works correctly.
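For orientation, speakable structured data generally has the shape sketched below, built from the schema.org SpeakableSpecification type. This is an illustration only: the page name, URL and CSS selectors are hypothetical, and the markup the plugin generates by default may differ.

```python
import json


def speakable_markup(name: str, url: str, css_selectors: list[str]) -> str:
    """Build a JSON-LD string that marks page sections as speakable."""
    data = {
        "@context": "https://schema.org/",
        "@type": "WebPage",
        "name": name,
        "speakable": {
            "@type": "SpeakableSpecification",
            # Selectors of the elements best suited for text-to-speech
            "cssSelector": list(css_selectors),
        },
        "url": url,
    }
    return json.dumps(data, indent=2)


markup = speakable_markup(
    "Example post",                      # hypothetical title
    "https://example.com/example-post",  # hypothetical URL
    [".entry-title", ".entry-content"],  # hypothetical selectors
)
```

The resulting string is what would normally be placed inside a script tag of type application/ld+json on the page.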

Post Types (from version 3.0)
This tab contains settings for choosing the post types used for audio generation and for applying a default Speech Template to each post type.
Post Types
Select the post types, including custom post types, that the Speaker should work with. The available types are in the drop-down list. When new post types are added to your site, they appear in the list.

The Speaker can only work with public posts. There are various reasons for this:
- the plugin is GDPR-compliant, which means it can’t use private information.
- the audio of a post/page is generated based on the current page markup available publicly.
If you use password-protected post types that are available only for certain users or a group of users, the Speaker can’t work with them.
Speech Templates
This settings section allows you to apply Speech Templates as the default for each of the post types that you select in the Post Types field. You can create a new Speech Template when editing or creating a page/post.

Audio Content
Before Audio
In this text field, you can add text that will be added at the beginning of the audio file. For example, you can add a greeting or a copyright notice.
After Audio
In this setting, you can add text that will be added to the end of the audio file. For example, you can add a farewell or a copyright notice.
Read the Title
The option allows you to voice the Title of the current post/page when generating audio.
The setting can be applied only to the default Speech Template “Content”.
Read the Image Caption
The option allows you to voice the Image caption of the current post/page when generating audio.
RegEx replacements
With this option, you can replace any group of characters before voicing by using regular expressions.

Enter the regular expression to be replaced, and on a new line write the term or SSML tag to replace it with. You can also use the WordPress filter speaker_after_content_regex_replace to manipulate content for voicing.

Shortcodes cannot be used in the regex rules. Use SSML tags instead.
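The idea of pattern and replacement pairs can be sketched as follows. This is an illustration in Python, not the plugin's implementation: the plugin runs in PHP, whose regular expression syntax differs in details, and the two rules shown here are hypothetical examples.

```python
import re

# Each rule is (regular expression, replacement term or SSML tag).
# Both rules are hypothetical and only demonstrate the technique.
RULES = [
    (r"\bWP\b", "WordPress"),                  # expand an abbreviation
    (r"\s*--\s*", ' <break time="300ms"/> '),  # turn a double hyphen into an SSML pause
]


def apply_rules(text: str, rules=RULES) -> str:
    """Apply every replacement rule to the text, in order."""
    for pattern, replacement in rules:
        text = re.sub(pattern, replacement, text)
    return text


voiced = apply_rules("WP is fast -- really")
```

After both rules run, the abbreviation is spoken in full and the double hyphen becomes a short pause.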
Storage
Starting from Speaker version 3.3.0, audio files can be saved not only locally but also on Google Drive. The section includes the following Storage settings:
Custom fields
The toggle enables or disables saving data about the generated audio file in meta fields. Read more in the guide.
Visible in the Media Library
The toggle to make the audio file visible and available in the Media Library.
Note: The files that were generated before the option was enabled will not be displayed.
Storage
Media library
Selected by default. Audio files are saved locally, and no additional settings are required.
Library + Google Drive
Allows you to automatically save files both locally and on Google Drive after the audio generation. For this, you need to get your own Google Drive API key file by following the guide. Once you have your JSON key file, upload it to the Google Drive API key field by dragging and dropping it or by clicking on the field.

Once the key file has been successfully added, you need to get your token by allowing access to the drive to save files. After the setup is completed, your audio files will also be saved on Google Drive in the “Speaker” folder.

Podcasts (from version 3.4.0)
An option that allows you to create an RSS podcast feed for any podcast service based on audio generated on your site.
Create RSS podcast feeds following the RSS feed guidelines for Google Podcasts, Apple Podcaster’s Guide to RSS, the Spotify Podcast Delivery Specification or the requirements of other podcast services.
When the RSS option is enabled, the Header and Item fields of the template become available.

Once all the configuration is done the feed will be available at the link https://yoursitelink?feed=speaker-podcast
To get a feed only with posts of a certain category, use the link
https://yoursitelink?cat=7,8&feed=speaker-podcast
Where cat = "category ID"
To get a feed only with posts of a certain tag, use the link
https://yoursitelink?tag=tag1,tag2&feed=speaker-podcast
Where tag="tag slug"
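The three link patterns above can be assembled with a small helper. This is a sketch with a hypothetical function name. It only reproduces the query parameters shown in the text (cat, tag and feed) and uses example.com as a stand-in for your site address.

```python
from urllib.parse import urlencode


def podcast_feed_url(site: str, categories=(), tags=()) -> str:
    """Build a feed link, optionally limited to category IDs or tag slugs."""
    params = {}
    if categories:
        params["cat"] = ",".join(str(c) for c in categories)
    if tags:
        params["tag"] = ",".join(tags)
    params["feed"] = "speaker-podcast"
    # safe="," keeps the commas between IDs or slugs unescaped, as in the links above
    return f"{site}?{urlencode(params, safe=',')}"


all_posts = podcast_feed_url("https://example.com")
by_category = podcast_feed_url("https://example.com", categories=[7, 8])
by_tag = podcast_feed_url("https://example.com", tags=["tag1", "tag2"])
```

Each call produces a link in the same shape as the corresponding example in the text.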
Updates
Check Updates
The toggle to enable/disable requests to the update server. When the option is enabled (by default), the plugin sends requests to our server when activating and updating our plugin.
You can turn off this setting if for some reason you need to disable requests to our server. In this case, automatic plugin updates will not be available but all the plugin functionality will be saved.
Please note: the plugin must be activated via your purchase code before the Check Updates feature is disabled.
Clear cache data
A button to clear cache data. Can be used when there is a problem with updates or activation.
Save Changes
Do not forget to click Save Changes after changing plugin settings. After clicking this button, the settings will be saved and applied to all pages of the website.
Now you can proceed to use the plugin on the pages of your WordPress website. Please read the articles Converting WordPress page to speech and Speech Synthesis Markup Language (SSML) in the Speaker WordPress plugin.
Difference between Speaker and Voicer plugins