Speaker 2.0+ WordPress Plugin settings

The Speaker WordPress Plugin converts a post or web page into human-like speech, with more than 194 voices across 35+ languages and variants. It applies groundbreaking research in speech synthesis (WaveNet) and Google’s robust neural networks to deliver high-fidelity audio.

Notes

This manual was written for plugin version 2.0.0 and higher. If you use an older version of the plugin, use the Speaker settings manual for versions 1.0.^.

To open the plugin settings, go to Speaker in the WordPress sidebar menu.


In the plugin settings, there are several tabs responsible for various groups of plugin settings.

Voice Settings

The voice tab describes the basic settings of the plugin, the selection of voices and features of speech synthesis.

API Key File

When you first run the plugin, you need to connect the Key File. To learn how to get the Key File, see the article About Key File for the Speaker & Voicer WordPress Plugins.

Now used

Shows which voice is currently selected and lets you listen to a sample of that voice with the default settings.


You can also listen to the Supported Google Voices and Languages in the Google Cloud Documentation.

Language

Use the filter to narrow the table down to the voices you need. After selecting a language, you can filter the table by voice type and gender.

The Speaker voice list

To select and activate a voice, click on the row in the table. The selected voice will be highlighted in blue.

Advanced voice settings

The toggle enables advanced voice and audio file settings. It is intended for experienced users.

Audio Format

Select the format in which the audio will be sent: MP3 or WAV. All recordings in other formats will become unavailable. After switching the format in the settings, audio files need to be recreated for each post.

Audio Encoding

The encoding determines the format of the output audio.

  • LINEAR16 – uncompressed 16-bit signed little-endian samples (Linear PCM). Audio content returned as LINEAR16 also contains a WAV header.
  • MP3 – MP3 audio at 32kbps.
  • OGG_OPUS – Opus-encoded audio wrapped in an Ogg container. The result is a file that can be played natively on Android and in browsers (at least Chrome and Firefox). The quality of the encoding is considerably higher than MP3 at approximately the same bitrate.
  • MULAW – 8-bit samples that compand 14-bit audio samples using G.711 PCMU/mu-law. Audio content returned as MULAW also contains a WAV header.
  • ALAW – 8-bit samples that compand 14-bit audio samples using G.711 PCMA/A-law. Audio content returned as ALAW also contains a WAV header.

Audio Profile

You can optimize the synthetic speech produced by Cloud Text-to-Speech API for playback on different types of hardware. For example, if your app runs primarily on smaller, ‘wearable’ types of devices, you can create synthetic speech from Cloud Text-to-Speech API that is optimized specifically for smaller speakers.

Speaking Rate/Speed

This setting changes the voice playback speed. The normal speed is 1. You can select any value in the range from 0.25 to 4: the lower the number, the slower the synthesized voice speaks, and the higher the number, the faster it speaks.

Pitch

Speaking pitch, in the range [-20.0, 20.0]. 20 means an increase of 20 semitones from the original pitch. -20 means a decrease of 20 semitones from the original pitch. The default value is 0.
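
To give a feel for what a shift in semitones means, the sketch below converts a Pitch value into a frequency ratio. It assumes the usual equal-tempered definition of a semitone (a factor of 2^(1/12)); this manual does not state how the voice engine applies the shift internally, and pitch_ratio is a hypothetical helper name.

```python
def pitch_ratio(semitones):
    """Frequency ratio for a pitch shift, assuming equal-tempered semitones."""
    return 2 ** (semitones / 12)


print(round(pitch_ratio(12), 3))   # 2.0   (one octave up doubles the frequency)
print(round(pitch_ratio(20), 3))   # 3.175 (the maximum value of the setting)
print(round(pitch_ratio(-12), 3))  # 0.5   (one octave down halves it)
```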

Volume Gain

The slider controls the audio gain. Values from -10 dB to 15 dB are available.

Sample Rate

Specify the synthesis sample rate.

We recommend a sample rate of at least 16 kHz in the audio files that you use for transcription with Speech-to-Text. Sample rates found in audio files are typically 16 kHz, 32 kHz, 44.1 kHz, and 48 kHz. Because intelligibility is greatly affected by the frequency range, especially in the higher frequencies, a sample rate of less than 16 kHz results in an audio file that has little or no information above 8 kHz. This can prevent Speech-to-Text from correctly transcribing spoken audio. Speech intelligibility requires information throughout the 2 kHz to 4 kHz range, although the harmonics (multiples) of those frequencies in the higher range are also important for preserving speech intelligibility. Therefore, keeping the sample rate to a minimum of 16 kHz is a good practice.
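
The advanced settings above correspond to fields of the audioConfig object in the Google Cloud Text-to-Speech REST API. The following is a minimal sketch, not the plugin's actual code: build_audio_config is a hypothetical helper, the value ranges are the ones stated in this manual, and the profile name is one of Google's published device profile IDs.

```python
# Encodings listed in the Audio Encoding section of this manual.
ENCODINGS = {"LINEAR16", "MP3", "OGG_OPUS", "MULAW", "ALAW"}


def build_audio_config(encoding="MP3", rate=1.0, pitch=0.0, gain_db=0.0,
                       sample_rate_hz=None, profile=None):
    """Return an audioConfig-style dict after checking the documented ranges."""
    if encoding not in ENCODINGS:
        raise ValueError(f"unsupported encoding: {encoding}")
    if not 0.25 <= rate <= 4.0:
        raise ValueError("speaking rate must be between 0.25 and 4")
    if not -20.0 <= pitch <= 20.0:
        raise ValueError("pitch must be between -20 and 20 semitones")
    if not -10.0 <= gain_db <= 15.0:
        raise ValueError("volume gain must be between -10 dB and 15 dB")

    config = {
        "audioEncoding": encoding,
        "speakingRate": rate,
        "pitch": pitch,
        "volumeGainDb": gain_db,
    }
    # Optional fields are only added when a value was chosen.
    if sample_rate_hz is not None:
        config["sampleRateHertz"] = sample_rate_hz
    if profile is not None:
        config["effectsProfileId"] = [profile]
    return config


config = build_audio_config(encoding="OGG_OPUS", rate=1.25, pitch=-2.0,
                            sample_rate_hz=24000,
                            profile="wearable-class-device")
print(config)
```

Checking the ranges before sending a request avoids paying for a call that the API would reject anyway.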

Automatic synthesis

This setting enables automatic speech synthesis when you press the Update button or the Publish button.

Notes

This option can significantly increase your monthly Google Cloud usage and costs.

With this option turned on, every time you change the page and save it, the entire content of the page is sent for speech synthesis again. For example:

  • you created a post of 5000 characters and published it – the content is automatically sent to Google Cloud and voiced, and 5000 characters are deducted from your balance
  • you added one comma (one character) to the text and clicked Save – 5001 characters are automatically sent for speech synthesis, and 5001 characters are deducted from your balance
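
The arithmetic behind this example can be sketched as follows. This is only an illustration of the billing logic described above (every save sends the whole post again); billed_characters is a hypothetical helper, not part of the plugin.

```python
def billed_characters(post_length_at_each_save):
    """Total characters sent for synthesis when every save resends the full post."""
    return sum(post_length_at_each_save)


# First publish with 5000 characters, then one comma added and saved again.
saves = [5000, 5001]
print(billed_characters(saves))  # 10001
```

Two saves of almost the same text are billed as 10001 characters, not 5001, which is why small edits add up quickly when automatic synthesis is enabled.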

Design

The design tab contains settings related to the design of the Player and its surrounding elements.

Player position

You can select one of the available player positions:

  • Before Content
  • After Content
  • Top Fixed
  • Bottom Fixed
  • Before Title
  • After Title
  • Before in Custom Filter – the option provides a field for entering a theme filter or a custom filter before which the player will be added.
  • After in Custom Filter – the option provides a field for entering a theme filter or a custom filter after which the player will be added.
  • WordPress Hook(action) – the option provides a field for entering a custom WordPress action to which the player will be added.
  • Shortcode [speaker]

Important

Some themes may have problems displaying the player in the Before Title and After Title positions. In such cases, switch the player to another position.

You can easily add the Speaker WordPress Plugin player anywhere on the page using the shortcode [speaker]. You can also show the player with the audio recording of another post: use the shortcode [speaker id=PAGEID], where PAGEID is the ID of the page whose audio version you want to use in the player.


You can find out the page ID in the address bar during editing or from the page code.
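
As an illustration, the sketch below reads the page ID from an edit-screen address and builds the matching shortcode. It assumes the standard WordPress edit URL of the form /wp-admin/post.php?post=42&action=edit; the example.com address and the shortcode_from_edit_url helper are hypothetical.

```python
from urllib.parse import urlparse, parse_qs


def shortcode_from_edit_url(edit_url):
    """Build a [speaker id=...] shortcode from a WordPress edit-screen URL."""
    query = parse_qs(urlparse(edit_url).query)
    # The "post" query parameter holds the ID of the page being edited.
    post_id = int(query["post"][0])
    return f"[speaker id={post_id}]"


url = "https://example.com/wp-admin/post.php?post=42&action=edit"
print(shortcode_from_edit_url(url))  # [speaker id=42]
```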

Style

The plugin has five player styles that you can use on the website pages. You can also customize any of the presented player templates with CSS.

Round Player

Core WordPress player with round edges.

Rounded Player

Core WordPress player with rounded corners.

Squared Player

Core WordPress player with right angles and straight edges.

WordPress Default Player

Regular WordPress audio player.

Browser Default Player

Default browser player without any styles or scripts. The appearance of the player differs from browser to browser (Chrome and Opera, Safari, Firefox).

The Chrome and Opera audio player has a link to download the audio file even when the Download Link is hidden in the plugin settings.

Please note that when you choose the Browser Default Player, the standard WordPress player is disabled on your site. This affects all players on all pages of your site.

To fine-tune the design of the player, use the CSS class .mdp-speaker-box.

Background color

This setting sets the background color of the player. It is available for all styles except WordPress Default Player and Browser Default Player.

Download link

The setting determines how to display the download link. You can choose one of these options:

  • Do not show
  • Backend Only
  • Frontend Only
  • Backend and Frontend

Speed controls

The option allows you to enable the control panel for the speed of audio playback. When the option is enabled, two additional settings are available:

  • Speeds section title – enter the title that will be displayed before the speed settings panel.
  • Available speeds – a field for entering the speed values that will be available to the user. The speed values must be separated by commas. Use a period for decimal numbers, for example 1.2, 1.5, 1.75.
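
The expected format of the Available speeds field can be checked with a short sketch like the one below. It is an illustration only: parse_speeds is a hypothetical helper, and how the plugin itself validates the field is not described in this manual.

```python
def parse_speeds(raw):
    """Parse a comma-separated list of speeds that use a period as the decimal mark."""
    speeds = []
    for part in raw.split(","):
        part = part.strip()
        if not part:
            continue  # ignore empty items such as a trailing comma
        value = float(part)  # raises ValueError for text that is not a number
        if value <= 0:
            raise ValueError(f"speed must be positive: {part}")
        speeds.append(value)
    return speeds


print(parse_speeds("1.2, 1.5, 1.75"))  # [1.2, 1.5, 1.75]
```

Note that a value written with a decimal comma, such as 1,5, would be read as two separate speeds (1 and 5), which is why the period is required.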

Audio Preload

The preload attribute specifies if and how the audio file should be loaded when the page loads.

The available preload options are:

  • None – the browser should NOT load the audio file when the page loads.
  • Metadata – the browser should load only metadata when the page loads.
  • Auto – the browser should load the entire audio file when the page loads.
  • Backend – the browser should NOT load the audio file but the plugin requests information about the file on the backend and saves its duration in HTML.

Content before the player

The toggle enables an input field for adding text, an image, or HTML markup before the audio player.

Description after

The toggle enables an input field for adding text, an image, or HTML markup after the audio player.

Autoplay

The option allows playing audio after the page loads. Some browsers block autoplay of audio and video as a matter of policy, so this feature may not work in certain browsers.

                          Chrome  Firefox  Opera  Safari  Edge
WordPress default player  No      No       Yes    No      Yes
Browser default player    No      No       Yes    Yes     Yes

The table lists the browsers that support the autoplay feature.

Loop

The option allows looping the audio playback.

Speakable Markup (from version 3.4.0)

The toggle enables or disables the Speakable markup option for the post types selected in Speaker > Post Types.

The speakable schema.org property identifies sections within an article or webpage that are best suited for audio playback using text-to-speech (TTS). Adding markup allows search engines and other applications to identify content to read aloud on Google Assistant-enabled devices using TTS. Web pages with speakable structured data can use the Google Assistant to distribute the content through new channels and reach a wider base of users.

Additional options are available when the Speakable Markup toggle is On.

  • Markup all posts – an option to enable the markup even for posts without an audio file generated.
  • JSON+LD Markup – a field for modifying the JSON-LD markup according to the Speakable Markup Guide. The default markup also works correctly.
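
For orientation, speakable structured data generally has the shape produced by the sketch below. It follows the schema.org SpeakableSpecification type; the plugin's default template is not shown in this manual, so the page name, URL, CSS selectors, and the speakable_markup helper are all hypothetical.

```python
import json


def speakable_markup(name, url, css_selectors):
    """Return speakable structured data for a page as a plain dict."""
    return {
        "@context": "https://schema.org/",
        "@type": "WebPage",
        "name": name,
        "speakable": {
            "@type": "SpeakableSpecification",
            # Selectors pointing at the parts of the page suited to text-to-speech.
            "cssSelector": list(css_selectors),
        },
        "url": url,
    }


markup = speakable_markup(
    "Sample post",
    "https://example.com/sample-post/",
    [".entry-title", ".entry-content"],
)
print(json.dumps(markup, indent=2))
```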

Post Types (from version 3.0)

This tab contains settings for choosing the post types for audio generation and for applying a Speech Template as the default for each post type.

Post Types

Select the post types, including custom post types, that the Speaker should work with. The available types are in the drop-down list. Once new post types are added to your site, they will be displayed in the list.

The Speaker can only work with public posts. There are various reasons for this:

  • the plugin is GDPR-compliant, which means it can’t use private information.
  • the audio of a post/page is generated based on the current page markup available publicly.

If you use password-protected post types that are available only for certain users or a group of users, the Speaker can’t work with them.

Speech Templates

This section allows you to apply Speech Templates as the default for each of the post types that you select in the Post Types field. You can create a new Speech Template when editing or creating a page/post.

Audio Content

Before Audio

In this text field, you can add text that will be placed at the beginning of the audio file. For example, you can add a greeting or a copyright notice.

After Audio

In this text field, you can add text that will be placed at the end of the audio file. For example, you can add a farewell or a copyright notice.

Read the Title

The option allows you to voice the Title of the current post/page when generating audio.

The setting can be applied only to the default Speech Template “Content”.

Read the Image Caption

The option allows you to voice the Image caption of the current post/page when generating audio.

RegEx replacements

With this option, you can replace any group of characters when voicing by using regular expressions.

Enter the regular expression to be replaced and, on a new line, write the term or SSML tag to replace it with. You can also use the WordPress filter speaker_after_content_regex_replace to manipulate content for voicing.

An example of replacing a number from 1 to 9 with the letter “a”

Shortcodes cannot be used in the regex rules. Use SSML tags instead.
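
The effect of such rules can be tried out with the sketch below, which reproduces the example of replacing the digits 1 to 9 with the letter “a” and adds one rule that inserts an SSML sub tag. It uses Python's re module purely for illustration; the plugin runs inside WordPress, so its regular expression syntax may differ in details, and apply_replacements is a hypothetical helper.

```python
import re


def apply_replacements(text, rules):
    """Apply (pattern, replacement) rules in order, like a list of regex rules."""
    for pattern, replacement in rules:
        text = re.sub(pattern, replacement, text)
    return text


# Rule from the example above: every digit from 1 to 9 becomes the letter "a".
digit_rule = [(r"[1-9]", "a")]
print(apply_replacements("Room 42, floor 7", digit_rule))  # Room aa, floor a

# Hypothetical rule that makes an abbreviation be read out in full via SSML.
ssml_rule = [(r"\bWP\b", '<sub alias="WordPress">WP</sub>')]
print(apply_replacements("WP plugin", ssml_rule))
```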

Storage

Starting from Speaker version 3.3.0, audio files can be saved not only locally but also on Google Drive. The section includes the following Storage settings:

Custom fields

The toggle enables or disables saving data about the generated audio file in meta fields. Read more in the guide.

Visible in the Media Library

The toggle to make the audio file visible and available in the Media Library.

Note: The files that were generated before the option was enabled will not be displayed.

Storage

Media library

Selected by default. Audio files are saved locally, and no additional settings are required.

Library + Google Drive

Allows you to automatically save files both locally and on Google Drive after the audio is generated. For this, you need to get your own Google Drive API key file by following the guide. Once you have your JSON key file, upload it to the Google Drive API key field by dragging and dropping it or by clicking on the field.

Once the key file has been added successfully, you need to get your token by allowing access to the drive so that files can be saved. After the setup is completed, your audio files will also be saved on Google Drive in the “Speaker” folder.

Podcasts (from version 3.4.0)

An option that allows you to create an RSS podcast feed for any podcast service based on audio generated on your site.

Create podcast RSS feeds following the RSS feed guidelines for Google Podcasts, Apple's Podcaster’s Guide to RSS, the Spotify Podcast Delivery Specification, or the requirements of other podcast services.

When the RSS option is enabled, the Header and Item fields of the template become available.

Once the configuration is done, the feed will be available at the link https://yoursitelink?feed=speaker-podcast

To get a feed only with posts of a certain category, use the link

https://yoursitelink?cat=7,8&feed=speaker-podcast

Here cat is a comma-separated list of category IDs.

To get a feed only with posts of a certain tag, use the link

https://yoursitelink?tag=tag1,tag2&feed=speaker-podcast

Here tag is a comma-separated list of tag slugs.
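
The feed links above can also be assembled programmatically. The sketch below is a small illustration that reuses the placeholder address https://yoursitelink from the examples; podcast_feed_url is a hypothetical helper and not part of the plugin.

```python
from urllib.parse import urlencode


def podcast_feed_url(site_url, categories=None, tags=None):
    """Build a Speaker podcast feed link, optionally filtered by category or tag."""
    params = {}
    if categories:
        params["cat"] = ",".join(str(c) for c in categories)
    if tags:
        params["tag"] = ",".join(tags)
    params["feed"] = "speaker-podcast"
    # safe="," keeps the commas between IDs or slugs unescaped, as in the examples.
    return f"{site_url}?{urlencode(params, safe=',')}"


print(podcast_feed_url("https://yoursitelink"))
# https://yoursitelink?feed=speaker-podcast
print(podcast_feed_url("https://yoursitelink", categories=[7, 8]))
# https://yoursitelink?cat=7,8&feed=speaker-podcast
print(podcast_feed_url("https://yoursitelink", tags=["tag1", "tag2"]))
# https://yoursitelink?tag=tag1,tag2&feed=speaker-podcast
```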

Updates

Check Updates

The toggle enables or disables requests to the update server. When the option is enabled (the default), the plugin sends requests to our server when the plugin is activated or updated.

You can turn off this setting if for some reason you need to disable requests to our server. In this case, automatic plugin updates will not be available, but all the plugin functionality will be preserved.

Please note: the plugin must be activated via your purchase code before the Check Updates feature is disabled.

Clear cache data

A button to clear cache data. Can be used when there is a problem with updates or activation.

Save Changes

Do not forget to click Save Changes after changing plugin settings. After clicking this button, the settings will be saved and applied to all pages of the website.

Now you can proceed to use the plugin on the pages of your WordPress website. Please read the article Converting WordPress page to speech and Speech Synthesis Markup Language (SSML) in the Speaker WordPress plugin.

Difference between Speaker and Voicer plugins
