Testing SpeechMate as a speech support system for public speaking and challenging conversations

ISRCTN ISRCTN15824435
DOI https://doi.org/10.1186/ISRCTN15824435
Sponsor University of Oxford
Funders Medical Research Council, Dominic Barker Trust, Engineering and Physical Sciences Research Council
Submission date
03/06/2026
Registration date
05/06/2026
Last edited
05/06/2026
Recruitment status
Not yet recruiting
Overall study status
Ongoing
Condition category
Mental and Behavioural Disorders
Prospectively registered
Record updated in last year

Plain English summary of protocol

Background and study aims
Public speaking is an important part of everyday life, from meetings and interviews to social and work settings. Many people use different techniques or tools to prepare and deliver their presentations. This study explores how people use different types of digital tools when preparing for and delivering presentations and challenging conversations. In particular, it aims to understand how SpeechMate, a mobile app developed in-house, is used during speaking tasks.

The app combines visual and auditory features that synchronise with a user’s own speech to provide a structured presentation experience. This study will help us understand how people interact with such technology and how these tools might be developed for future use in public-speaking contexts.

Who can participate?
Adults aged 18 to 65 years can take part if they have a native or near-native level of English, sufficient to complete short speaking tasks and questionnaires, and normal or corrected-to-normal hearing and vision.

To take part, participants should:
• Be able to deliver short prepared presentations and complete brief questionnaires
• Have normal or corrected-to-normal hearing and vision

What does the study involve?
Participants will attend one laboratory visit lasting about 60 to 90 minutes at the University of Oxford. They will complete questionnaires and speaking tasks while audio and video recordings are made.

In the public speaking task, adults who stutter and typically fluent speakers will complete two conditions. In one condition, they will read a prepared speech from text shown on augmented reality glasses without a guiding voice. In another condition, they will see the text on augmented reality glasses and hear a guiding voice through bone conduction headphones. They will speak in unison with the guiding voice, and can slow the voice during speech if needed.

Adults with apraxia of speech will complete the task using augmented reality glasses only. They will present two versions of a text.

In the conversation task, participants will complete semi-structured speaking scenarios, such as a job interview, booking a doctor appointment, speaking to an authority figure, ordering in a cafe, and introducing themselves to a stranger. In one condition, the scenarios will be completed without SpeechMate guidance. In the other condition, SpeechMate will generate a suggested reply using non-sensitive background information provided by the participant. The reply will appear on the augmented reality glasses and will also be played as a guiding voice through bone conduction headphones. The participant will then speak in unison with the guiding voice. If they do not want to use the generated reply, they can respond in their own words, paced by a metronome cue.

After the laboratory visit, adults who stutter and typically fluent speakers will be asked to use SpeechMate during at least three self-selected speaking situations over one month and complete brief use logs.

What are the possible benefits and risks of participating?
Participants may not benefit directly from taking part. Some participants may find SpeechMate useful or may find the speaking tasks interesting. The study may help the research team understand how speech support systems can be used during public speaking and conversation.

The risks are expected to be low. Some participants may feel tired or uncomfortable during speaking tasks. Some may find the augmented reality glasses, earbuds, or speaking tasks mildly uncomfortable. Participants can take breaks or stop taking part at any time. The system is not used to diagnose speech difficulties, make clinical decisions, or replace speech and language therapy.

Where is the study run from?
The Institute of Biomedical Engineering, Department of Engineering Science, University of Oxford, Oxford, UK.

When is the study starting and how long is it expected to run for?
The study is expected to start in June 2026, and recruitment is expected to continue until December 2026. The study is expected to finish in February 2027, after the final one-month follow-up has been completed.

Who is funding the study?
1. The University of Oxford Medical and Life Sciences Translational Fund
2. Medical Research Council
3. Engineering and Physical Sciences Research Council
4. The Dominic Barker Trust supported earlier development of the SpeechMate research programme

Who is the main contact?
Dr Birtan Demirel, Institute of Biomedical Engineering, Department of Engineering Science, University of Oxford, birtan.demirel@eng.ox.ac.uk

Contact information

Prof Timothy Denison
Principal investigator

University of Oxford, Old Road Campus Research Building, Headington
Oxford
OX3 7DQ
United Kingdom

ORCID ID 0000-0002-5404-4004
Phone +44 01865 617675
Email timothy.denison@eng.ox.ac.uk
Dr Birtan Demirel
Public, Scientific

University of Oxford, Old Road Campus Research Building, Headington
Oxford
OX3 7DQ
United Kingdom

ORCID ID 0000-0002-7295-5143
Phone +44 07721928334
Email birtan.demirel@eng.ox.ac.uk

Study information

Primary study design Interventional
Allocation Randomized controlled trial
Masking Blinded (masking used)
Control Active
Assignment Crossover
Purpose Proof of concept evaluation of an experimental speech support system
Scientific title A proof of concept study of SpeechMate for synchronised speech guidance during public speaking and challenging conversations in adults who stutter, adults with apraxia of speech, and typically fluent adults, measuring percentage of disfluent syllables, speech naturalness, usability, and willingness to give public presentations
Study acronym SpeechMate
Study objectives 1. To test SpeechMate as an experimental speech support system that helps participants speak in unison with a guiding voice during public speaking and semi-structured conversations.
2. To compare disfluency, speech naturalness, speech duration, participant preference, and speaking confidence between SpeechMate supported and comparator speaking conditions.
3. To assess feasibility and logged use during self-selected speaking situations over one month outside the laboratory.
4. In addition to the primary analysis in adults who stutter, exploratory speech and participant experience outcomes will be analysed in adults with apraxia of speech and typically fluent adults.
Ethics approval(s)

Approved 18/02/2026, Oxford Central University Research Ethics Committee, Medical Sciences Interdivisional Research Ethics Committee (MS IDREC) (Research Services, University of Oxford, Boundary Brook House, Churchill Drive, Headington, Oxford, OX3 7GB, United Kingdom; +44 01865 616577; ethics@medsci.ox.ac.uk), ref: 2487591

Health condition(s) or problem(s) studied Speech production during public speaking and semi-structured conversation, including developmental stuttering, apraxia of speech, and typically fluent speech.
Intervention SpeechMate is an experimental speech support system based on the principle that speaking in unison with an external voice can reduce disfluent syllables in people who stutter. The system combines a mobile application, augmented reality glasses, bone conduction earbuds, adjustable guiding voice, and text adaptation tools. Depending on the task, SpeechMate will provide either visual text support through augmented reality glasses, auditory guidance through bone conduction earbuds, or both. In conditions with auditory guidance, participants will be asked to speak in unison with the guiding voice, either word by word or with a short delay (also known as shadowing). The guiding voice can be slowed by the participant during speech if they anticipate difficulty or need to regain synchrony.

Participants will attend one laboratory visit lasting approximately 60 to 90 minutes. They will complete baseline questionnaires, complete the study speaking tasks, select a guiding voice, and complete practice trials before the main experimental conditions. Practice duration will vary according to participant familiarity with the technology and will be recorded.

In Experiment 1, adults who stutter will complete a structured public speaking task in two conditions. In baseline Teleprompter Mode, the prepared speech text will be displayed on augmented reality glasses without auditory guidance. In SpeechMate Presentation Mode, the prepared speech text will be displayed on augmented reality glasses while the selected guiding voice is delivered through bone conduction earbuds. Participants will speak in unison with the guiding voice and may slow the audio in real time. Condition order will be randomised and counterbalanced across participants using a pre-generated Latin square schedule. The allocation sequence will be created before recruitment using a computer-generated randomisation list, and participants will be assigned to the next available sequence after enrolment.
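
As a rough illustration of how a pre-generated, counterbalanced schedule for two conditions could be produced, the sketch below builds a 2 x 2 Latin square (each condition appears once in each position) and shuffles complete blocks of sequences with a fixed seed. The helper names, the seed, the block structure, and the number of participants are assumptions made for illustration only; this record does not describe the software or block size actually used.

```python
import random

# Condition names taken from the Experiment 1 description above.
CONDITIONS = ["Teleprompter Mode", "SpeechMate Presentation Mode"]


def latin_square(conditions):
    """Cyclic Latin square: row i is the condition list rotated by i."""
    n = len(conditions)
    return [[conditions[(i + j) % n] for j in range(n)] for i in range(n)]


def make_schedule(n_participants, conditions=CONDITIONS, seed=2026):
    """Pre-generate condition orders in shuffled blocks of Latin square rows.

    Hypothetical helper: the seed and the blocking scheme are illustrative
    assumptions, not details taken from the study protocol.
    """
    rng = random.Random(seed)
    rows = latin_square(conditions)
    schedule = []
    while len(schedule) < n_participants:
        block = [list(row) for row in rows]
        rng.shuffle(block)  # randomise sequence order within each block
        schedule.extend(block)
    return schedule[:n_participants]


schedule = make_schedule(6)
for slot, order in enumerate(schedule, start=1):
    # Each newly enrolled participant would take the next unused slot.
    print(slot, " -> ".join(order))
```

Because every block contains each row of the square exactly once, an even number of participants yields equal numbers starting with each condition.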

Adults with apraxia of speech will complete an exploratory Teleprompter Mode task comparing original text with artificial intelligence (AI) simplified text. Both texts will be displayed on augmented reality glasses without auditory guidance. The purpose of this subgroup is to describe speech performance with augmented reality glasses and feasibility when text complexity is changed. Typically fluent adults will complete the SpeechMate tasks as a reference group, allowing speech naturalness, delivery, usability, and participant experience to be compared across groups.

Before Experiment 2, each participant will provide non-sensitive background information, such as interests, hobbies, education, roles, and general self descriptions. SpeechMate will use this information to create a foundational model of the participant for the conversational scenarios. The model will be used to generate replies that are personally relevant rather than generic, and will be stored under the participant’s study identifier.

In Experiment 2, participants will complete semi-structured conversational scenarios modelled on common speaking situations, including a job interview, booking a doctor appointment, speaking to an authority figure, ordering in a cafe, and introducing themselves to a stranger. Each scenario will be completed in two blocks. In the baseline block, participants will respond without SpeechMate guidance. In the assisted block, SpeechMate will generate a suggested reply from the participant’s foundational model, display the reply on augmented reality glasses, and convert it into an auditory guiding cue delivered through bone conduction earbuds. Participants will then speak the reply in unison with the guiding voice, so that the same choral speech principle used in Presentation Mode is tested during conversation. Participants may choose not to use a generated reply and may instead respond spontaneously using metronome-paced speech.

After the laboratory visit, adults who stutter will be asked to use SpeechMate during at least three self-selected speaking situations over one month and complete brief use logs.
Intervention type Behavioural
Primary outcome measure(s)
  1. Percentage of disfluent syllables, defined as the number of stuttered syllables (e.g., blocks, prolongations, part-word repetitions, and whole-word repetitions) in each speaking block divided by the total number of syllables in that block, multiplied by 100, measured using audio and video recordings at each laboratory speaking block in Experiment 1 and Experiment 2
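
The calculation behind the primary outcome can be sketched as follows. This is a minimal illustration of the stated definition (stuttered syllables divided by total syllables, multiplied by 100); the function name and the syllable counts are made-up assumptions, not study data or the study's analysis code.

```python
def percent_disfluent_syllables(stuttered_syllables, total_syllables):
    """Return stuttered syllables as a percentage of all syllables in a block."""
    if total_syllables <= 0:
        raise ValueError("total_syllables must be positive")
    if not 0 <= stuttered_syllables <= total_syllables:
        raise ValueError("stuttered_syllables must lie between 0 and total_syllables")
    return 100.0 * stuttered_syllables / total_syllables


# Hypothetical counts for two speaking blocks (illustrative only).
example_blocks = {
    "comparator block": (12, 200),
    "SpeechMate block": (3, 200),
}

results = {
    name: percent_disfluent_syllables(stuttered, total)
    for name, (stuttered, total) in example_blocks.items()
}
for name, value in results.items():
    print(f"{name}: {value:.1f}% disfluent syllables")
```

With these invented counts, the comparator block works out to 6.0% and the SpeechMate block to 1.5%.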
Key secondary outcome measure(s)
  1. Speech naturalness measured using audio and video recordings rated by trained raters using a 9-point naturalness scale, where 1 indicates highly natural sounding speech and 9 indicates highly unnatural sounding speech, at each laboratory speaking block in Experiment 1 and Experiment 2
  2. Speech duration, defined as the total time taken to complete each speaking block measured using audio and video recordings at each laboratory speaking block in Experiment 1 and Experiment 2
  3. Missed syllables, defined as the number of syllables omitted from the target text or generated reply during each speaking block measured using audio and video recordings at each laboratory speaking block in Experiment 1 and Experiment 2
  4. Stuttering severity measured using the Stuttering Severity Instrument, Fourth Edition (SSI-4), based on reading and spontaneous speech samples recorded at the laboratory visit before the main experimental conditions
  5. Impact of stuttering on daily communication and quality of life measured using the Overall Assessment of the Speaker’s Experience of Stuttering (OASES) at baseline and after the one-month follow-up period
  6. Fear of negative evaluation measured using the Fear of Negative Evaluation (FNE) scale at baseline and after the one-month follow-up period
  7. Willingness to give public presentations measured using participant self-report ratings at baseline and after the one-month follow-up period
  8. Participant preference measured using participant self-report preference ratings comparing SpeechMate supported and comparator speaking conditions at a time point after completion of the relevant laboratory study conditions
  9. Articulatory accuracy in adults with apraxia of speech measured using apraxia-relevant speech ratings of audio and video recordings from each Teleprompter Mode text condition (original text and artificial intelligence simplified text) in the exploratory apraxia subgroup at the laboratory visit
Completion date 01/02/2027

Eligibility

Participant type(s)
Age group Mixed
Lower age limit 18 Years
Upper age limit 65 Years
Sex All
Target sample size at registration 45
Key inclusion criteria 1. Adults aged 18 to 65 years
2. Able to speak English with sufficient proficiency to complete prepared speaking tasks, semi-structured conversation tasks, and self-report questionnaires
3. Normal or corrected-to-normal hearing and vision
4. Able to provide written informed consent
5. For the stuttering group: identifies as a person who stutters or has a prior diagnosis of developmental stuttering
6. For the typically fluent group: identifies as a typically fluent speaker
7. For the apraxia of speech group: self-reported clinical diagnosis of apraxia of speech
Key exclusion criteria 1. Uncorrected hearing or vision impairments that would interfere with the study tasks
2. Any medical or neurological condition that would make wearing smart glasses, bone conduction earbuds, or wearable input devices uncomfortable or unsafe
3. A diagnosed speech, language, or neurological disorder other than developmental stuttering or apraxia of speech, such as dysarthria
4. Inability to complete the speaking tasks or questionnaires in English
5. Inability or unwillingness to provide written informed consent
Date of first enrolment 15/06/2026
Date of final enrolment 07/12/2026

Locations

Countries of recruitment

  • United Kingdom
  • England

Study participating centre

University of Oxford, Biomedical Engineering
Old Road Campus Research Building, Headington
Oxford
OX3 7DQ
England

Results and Publications

Individual participant data (IPD) Intention to share Yes
IPD sharing plan De-identified summary data, analysis code, and non-identifiable study materials generated during and/or analysed during the current study may be stored in a publicly available repository after publication, such as OSF or another suitable research repository.

The full raw dataset will not be made publicly available because it may include identifiable speech, facial images, personal background information, and free text responses. Raw audio and video recordings will not be shared as an unrestricted dataset. Identifiable data, including names, contact details, consent forms, and linkage files, will not be shared.

Where participants provide separate explicit consent, selected audio or video recordings may be used for research meetings, teaching, scientific publication, or public outreach, in accordance with the approved consent form and ethics protocol. These recordings will not be shared beyond the scope of the participant’s consent.

Requests for additional de-identified participant-level data may be considered after publication by contacting Dr Birtan Demirel at birtan.demirel@eng.ox.ac.uk. Access will be considered for bona fide research purposes, such as verification of reported analyses or secondary analyses related to disfluency rates or speech naturalness. Any sharing will be subject to participant consent, ethical approval where required, University of Oxford data protection procedures, and an appropriate data sharing agreement. Data will be shared only where participants cannot reasonably be identified.

Study outputs

Output type Details Date created Date added Peer reviewed? Patient-facing?
Other publications version 1.0 13/02/2026 04/06/2026 Yes No
Protocol file 05/06/2026 No No

Additional files

49647_Consent_Form_v1.0_13Feb2026.pdf
Other publications
49647_Protocol.pdf
Protocol file

Editorial Notes

05/06/2026: Study’s existence confirmed by the Oxford Central University Research Ethics Committee, Medical Sciences Interdivisional Research Ethics Committee (MS IDREC), UK.