Print version ISSN 1809-9777. Online version ISSN 1809-4864
Int. Arch. Otorhinolaryngol. vol.18 no.3 São Paulo 2014
The various causes of dysphonia cannot be properly differentiated without knowledge of the anatomy and physiology of phonation. Furthermore, visualization of the larynx and of vocal fold vibration is essential to the diagnosis. Alterations in vocal fold vibration can contribute to the development of laryngeal pathologies or can be the result of such pathologies, having, in either case, a direct impact on the acoustic quality of the voice.1
Vocal folds vibrate at a frequency of 80 to 1,000 cycles per second.2 As the human eye is capable of perceiving no more than five images per second,3 it is impossible to evaluate the vibration of the vocal folds during phonation with the naked eye.
According to Talbot's law, an image projected on the retina will persist for ∼0.2 seconds. When successive images are presented at intervals of less than 0.2 seconds, they merge and our retina sees the movement as stationary.4 5 Based on these principles, videolaryngostroboscopy has proved to be an essential tool to infer vocal fold vibration. Videostroboscopy involves pulsing light at a frequency that is 0.5 Hz or 1.5 Hz lower than the fundamental frequency. Because the flash frequency is slightly lower than the vibration frequency of the vocal folds, each flash illuminates a slightly later portion of the vibratory cycle, and the illusion of slow motion is obtained, clearly showing the four phases of the glottal cycle.2 6
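The slow-motion effect follows from simple arithmetic: when the flashes fire slightly below the fundamental frequency, each flash catches the folds a little later in the cycle, and the apparent motion repeats at the difference between the two frequencies. The sketch below illustrates this in Python, assuming a perfectly periodic vibration. The function names and the 200 Hz example value are hypothetical and chosen only for illustration; they do not come from any stroboscopy system.

```python
def apparent_frequency_hz(f0_hz: float, offset_hz: float) -> float:
    """Rate of the slow-motion illusion: fundamental minus flash rate."""
    flash_rate_hz = f0_hz - offset_hz
    return f0_hz - flash_rate_hz


def phase_advance_per_flash(f0_hz: float, offset_hz: float) -> float:
    """Fraction of a glottal cycle by which each flash lags the previous one."""
    flash_rate_hz = f0_hz - offset_hz
    # Between two flashes the folds complete f0 / flash_rate cycles.
    # The whole cycle goes unseen; only the fractional excess is visible.
    return f0_hz / flash_rate_hz - 1.0


def real_cycles_per_apparent_cycle(f0_hz: float, offset_hz: float) -> float:
    """Number of true vibratory cycles sampled to build one apparent cycle."""
    return f0_hz / offset_hz


if __name__ == "__main__":
    f0 = 200.0     # hypothetical fundamental frequency (Hz)
    offset = 1.5   # flash rate is 1.5 Hz below f0, as in the text
    print(apparent_frequency_hz(f0, offset))           # 1.5
    print(phase_advance_per_flash(f0, offset))
    print(real_cycles_per_apparent_cycle(f0, offset))
```

Under these assumptions, one apparent cycle is assembled from more than a hundred real cycles, so the image is a composite rather than a record of any single cycle.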
Despite being the most widely used method in routine clinical practice, videostroboscopy has some limitations. For the strobe light and fundamental frequency to be synchronized, vocal fold vibration must be relatively periodic. In addition, as it represents a subsampling of several vibrational cycles, it is not possible to assess the variations between and within cycles. Furthermore, videostroboscopy is not capable of recording the onset and offset of phonation.
The advent of high-speed videolaryngoscopy (HSV) has made it possible to record the details of vocal fold vibration in real time, allowing us to further our knowledge about the normal phonation mechanism and to understand vibrational characteristics in a patient's aperiodic vibration pattern. The method allows for images to be recorded at a speed of up to 4,000 frames per second (fps).
Documenting vibrational cycles and the dynamic characteristics of vocal folds during phonation allows the vibrational movement to be seen in detail, regardless of the regularity or irregularity of its behavior. In addition, HSV can visualize vocal onset and offset, phonatory breaks, and laryngospasm, which occur too quickly to be captured by videostroboscopy and in which significant aperiodicity occurs as a normal phenomenon.7 8
One normophonic individual and four patients with different pathologies were invited to participate in the study. The diagnoses (vocal fold nodules, retention cyst, left unilateral paralysis, and adductor spasmodic dysphonia) had been previously established by videolaryngostroboscopy and through an auditory-perceptual voice assessment conducted by two speech therapists, both of whom were voice experts, and three laryngologists. The study was approved by the Research Ethics Committee of the University of São Paulo School of Medicine Hospital das Clínicas, located in the city of São Paulo, Brazil (protocol no. 0767/09).
As shown in Fig. 1, the HSV system employed (Richard Wolf, Knittlingen, Germany) comprised a 90-degree rigid laryngoscope (model HRES 8454.002), camera (model HRES ENDOCAM 5562), 300-W xenon light source (model AUTO LP 5132), and integrated microphone (model 5052.801).
Each participant underwent an HSV examination of the larynx. The tests were run with the subject sitting, having been previously anesthetized with 10% lidocaine, according to usual laryngoscopy procedures with a rigid telescope. The subject was instructed to sustain the vowel /e/ at a comfortable intensity and average fundamental frequency during modal emission. The images were recorded for 2 seconds at a sampling rate of 4,000 fps. For storing the data and generating digital videokymographic images, we used the HRES ENDOCAM software (Richard Wolf).
Glottal closure: defined as the shape of the free margins at maximum glottic closure and categorized as complete or incomplete. When closure was incomplete, the shape of the chink was described as hourglass, anterior glottic chink, posterior triangular glottic chink, midposterior triangular chink, fusiform, or irregular.
Vibration symmetry: compares the left and right vocal folds in terms of their mechanical properties. The vibration of one vocal fold should be a mirror image of that of the contralateral fold. Vibration was categorized as symmetric or asymmetric.
Periodicity: refers to the regularity in the duration of successive cycles of vibration. Vibration was classified as regular or irregular by visually assessing successive glottal cycles.
Mucosal wave: defined as the propagation of the epithelium and superficial layer of the lamina propria from the lower to the upper surface during phonation. This parameter was classified as normal, reduced, absent, or increased.
Amplitude: defined as the horizontal excursion from the midline. Vibration amplitude was considered normal when the free margin moved away from the midline by approximately one third to one half of the visible width of the true vocal folds. This parameter was classified as normal, increased, or reduced.
Glottal cycle phases: divided into opening, open, closing, and closed. The duration of the closed phase was considered normal when it was ∼50% of the cycle.
Figs. 2 to 4 were obtained from an individual with no voice-related complaints. Fig. 2 shows frame-by-frame sequential images of the vocal folds, recorded by the high-speed camera during a single glottal cycle. The sampling rate was 4,000 fps. The interval between frames was 0.25 milliseconds. The duration of the cycle, calculated by counting the frames, was 6 milliseconds. On the basis of this information, the fundamental frequency of phonation in this individual was estimated at ∼167 Hz.
In the two-dimensional assessment of the glottal cycle frame sequence, complete glottic closure was observed, as was normal vibration amplitude and symmetry between vocal folds. The propagation of the mucosal wave was within normal parameters.
Fig. 3 shows the videokymographic images of the middle third of the glottis with 46.75 milliseconds of information. Eight complete cycles were observed. On the basis of those data, we inferred a fundamental frequency of 171 Hz, similar to the value obtained by counting the two-dimensional frames. The sequential images of the vibrational cycles allowed us to classify the vibration as regular.
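The two fundamental frequency estimates reported for the normophonic individual can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the count of 24 frames per cycle is not stated in the text but is derived here from the reported 6-millisecond cycle and 0.25-millisecond frame interval.

```python
FRAME_RATE_FPS = 4000  # sampling rate reported in the study


def frame_interval_ms(frame_rate_fps: float) -> float:
    """Time between consecutive frames, in milliseconds."""
    return 1000.0 / frame_rate_fps


def f0_from_cycle_frames(frames_per_cycle: int, frame_rate_fps: float) -> float:
    """Estimate f0 (Hz) from the number of frames spanning one glottal cycle."""
    return frame_rate_fps / frames_per_cycle


def f0_from_kymogram(n_cycles: int, window_ms: float) -> float:
    """Estimate f0 (Hz) from complete cycles counted in a kymographic window."""
    return n_cycles / (window_ms / 1000.0)


if __name__ == "__main__":
    # Derived assumption: 6 ms cycle / 0.25 ms per frame = 24 frames.
    frames_per_cycle = round(6.0 / frame_interval_ms(FRAME_RATE_FPS))
    print(frames_per_cycle)                                                 # 24
    print(round(f0_from_cycle_frames(frames_per_cycle, FRAME_RATE_FPS)))   # 167
    print(round(f0_from_kymogram(8, 46.75)))                                # 171
```

The single-cycle estimate is limited by the 0.25-millisecond frame spacing, whereas the kymographic estimate averages over eight cycles, which may explain the small difference between the two values.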
Fig. 4 is an augmented videokymographic image, which allowed us to assess the vibrational cycles and the four phases that make up each cycle: closed, opening, open, and closing. In each of those cycles, we evaluated vibration amplitude; phase symmetry; closed phase duration; vertical difference between lower and upper lip; and closing.
Fig. 5 shows the successive two-dimensional images recorded in the patient with vocal fold nodules. The images represent a single glottal cycle. In the first frame, we are able to see the closed phase at the moment just before the opening phase begins. In other words, when subglottic pressure was higher than myoelastic force, the vocal folds began to move away from the midline. Glottic closure was incomplete, with a posterior triangular chink associated with an anterior fusiform chink. The test allowed us to observe the asymmetry of the vibration amplitude. In this patient, the mediolateral movement of the free margin was greater for the left vocal fold than for the right. Although there was appropriate mucosal wave excursion on both sides, it was more pronounced on the left.
The videokymography revealed asymmetry during the vibration cycle. Nevertheless, the vibration was regular. The digital videokymographic image of the middle third of the true vocal cord, where there was nodular thickening, shows overlapping images and the greatest contact area during vibration at this point. There were no contact points along the anterior vocal fold (Fig. 6).
In the patient with an epidermoid cyst, HSV allowed us to assess the impact of the cystic lesion on the dynamics of vocal fold vibration (Fig. 7). In the right vocal fold, the lesion impeded the excursion of the mucosal wave. In the contralateral vocal fold, amplitude was adequate, and there was an appropriate mucosal wave. The lesion led to incomplete glottic closure. We also observed phase asymmetry in the right vocal fold when compared with the anterior and middle thirds of the left vocal fold, as well as phase difference between the vocal folds.
The digital videokymographic image shown in Fig. 8 corroborates the findings described above. The scan of the midpoint of the vocal folds shows that the free margin of the right vocal fold was “frozen.” On the other hand, the mucosal wave excursion and the amplitude of the contralateral vocal fold were within the limits of normality. In the scan of the anterior third of the folds, we observed interference with the vibration of the left fold, with less vibration amplitude and incomplete closure. The anterior third of the left vocal fold vibrated at a different timing than did the middle third, indicating phase asymmetry within the same vocal fold, as was observed in the two-dimensional images.
Figs. 9 and 10 show the HSV images of a patient with paralysis of the left vocal fold. The two-dimensional images recorded by the high-speed camera were analyzed frame by frame. The first frame shows vocal fold adduction at maximum glottic closure. The glottic closure was not complete, and there was an evident irregular chink. In addition, there was phase asymmetry between the right and left vocal folds, together with greater amplitude on the left and increased mucosal wave propagation on the left. The paralyzed vocal fold showed a phase difference when compared with the anterior and middle third.
Through digital videokymography, the regular vibration of the right vocal fold was observed. The free margin shows the profile in the shape of a spicule. The temporal visualization of the glottal cycles shows that the vibration of the paralyzed vocal fold was irregular and its amplitude was augmented. In some cycles, contact was observed along the middle third of the vocal folds. In those cycles, the duration of the closed phase was extremely short.
Fig. 11 shows the patient with adductor spasmodic dysphonia. The initial assessment of successive vibrational cycles showed full glottic closure, reduction of vibration amplitude and of the propagation of the mucosal wave, as well as a longer duration of the closed phase in relation to the total duration of the cycle.
Temporal assessment of the glottal cycles through digital videokymography showed different vibrational patterns. A spasm was observed in detail (Figs. 11A and 11B). Before the spasm began, there was a certain regularity to the cycle and the various phases of the glottal cycle were identifiable. Then, there was an increase in duration in the closed phase of the glottal cycle, followed by eight irregular cycles and a period of spastic contraction of the vocal folds, during which it was not possible to differentiate among the glottic cycle phases.
The temporal analysis of the cycles allowed us to conclude that this patient experienced several spasms or oscillating breaks during the 2 seconds recorded.
Videostroboscopy is a major complement to clinical history taking and physical examination. In 10 to 30% of cases, the videostroboscopy findings lead to a change in the diagnosis and treatment of patients with dysphonia.10 However, abnormal findings were reported in up to 58% of asymptomatic voice professionals with no vocal pathologies, showing the importance of a careful analysis of the test as a means of screening for vocal pathologies.11 The findings of this examination should be assessed in conjunction with those of the history taking and auditory-perceptual assessment, rather than in isolation.
As previously mentioned, the greatest disadvantage of videostroboscopy is its inability to register the fundamental frequency in irregular cycles. The advent of high-speed cameras has enabled the assessment of vocal fold vibration in irregular cycles, at the onset and offset of phonation, as well as intra- and between-cycle evaluation.
High-speed cameras allow images to be recorded at a speed of up to 4,000 fps. The camera used in the present study is capable of recording images at two different speeds: 2,000 and 4,000 fps. The recorded events are viewed at 15 to 30 fps, so that details of vocal fold vibration may be observed. Therefore, a 2-second recording of vocal fold vibration at 4,000 fps produces a video of ∼9 minutes in length when played back at 15 fps, or ∼4.5 minutes at 30 fps.
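The relationship between capture rate, playback rate, and viewing time is a simple ratio, sketched below in Python. The function name is hypothetical; the figures are the capture and playback rates quoted in the text.

```python
def playback_minutes(recording_s: float, capture_fps: float, playback_fps: float) -> float:
    """Viewing time, in minutes, of a high-speed recording played back slowly."""
    total_frames = recording_s * capture_fps
    return total_frames / playback_fps / 60.0


if __name__ == "__main__":
    # A 2-second recording at 4,000 fps contains 8,000 frames.
    print(round(playback_minutes(2, 4000, 15), 1))  # 8.9
    print(round(playback_minutes(2, 4000, 30), 1))  # 4.4
```

This illustrates why reviewing a complete recording frame by frame is time-consuming and why a condensed view such as digital videokymography is convenient in practice.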
Several studies have assessed the contribution of high-speed cameras to clinical practice, some even comparing the use of videostroboscopy to that of HSV.12 13 14 15 However, the lack of data prevents us from predicting for which patients high-speed videolaryngoscopy is essential.
The first case evaluated in this study was that of a woman with no laryngeal alteration or voice-related complaints. The vibration was regular and there was symmetry in the vibrational phase, similar to the standard stroboscopic description of normal vibration.9 The sequential assessment of vibrational cycles with digital videokymography allowed more rapid investigation of this parameter, given that assessing successive cycles through digital videolaryngoscopy is impractical from the clinical point of view, because of the time needed for analyzing the images.
When comparing videostroboscopy and HSV in relation to regularity, it was observed that 30% of patients assessed using videostroboscopy showed moments of irregular vibration, compared with only 4% when HSV was used.14 Although the vibration was symmetric in case 1, individuals with no voice-related complaints can have periods of irregular vibration.14 16
In the patient with vocal nodules, there was asymmetry in the vibration cycle between the middle third and anterior third of vocal folds. Asymmetry in the vibration cycle in the middle third of the vocal folds is justified by the presence of secretion, resulting from greater collision forces in this area during phonation. The presence of mucus induces irregular vibration.17 Phase asymmetry is more evident in the anterior third, as the presence of the fusiform anterior chink potentiates the irregularity of the vibration.18
The two-dimensional images of the vibration in the patient diagnosed with paralysis of the left vocal fold showed incomplete glottic closure. The presence of a glottic chink interferes with vocal fold vibration, leading to irregular glottic pulses.19 By comparing the mediolateral movement of the free margin of each vocal fold, we observed that the mucosal wave was initially triggered in the left fold. The beginning of vibration probably occurred in that vocal fold due to a lack of muscular tonus resulting from denervation. Glottic resistance in relation to transglottic airflow was reduced. Left vocal fold bowing occurred due to the volume deficiency resulting from the muscle atrophy, a characteristic of laryngeal paralysis.
By analyzing the mediolateral movement of each vocal fold, we found that the amplitude of vibration was apparently lower in the left vocal fold than in the right. This might be due to the fact that the free margin of the left vocal fold is more lateralized at the beginning of vibration. Because of the arching of the paralyzed (left) vocal fold, the excursion began at a lateral point in relation to the midline. If the midline is considered the reference point, the left vocal fold presented higher amplitude, as was also seen in the two-dimensional image.
The videokymographic image of the right vocal fold shows the spicule shape of the free margin. This profile can be explained by the fact that, because the right vocal fold is displaced toward the midline (Bernoulli effect), it does not meet the wall of the contralateral vocal fold, which is paralyzed. In some cycles there is contact between the free margins along the middle third. However, the duration of the closed phase is extremely short.
Comparing these findings with those of the normophonic individual (case 1), it is observed that in vocal folds with no alterations, the outline of the free margin during normal vibration is rounded and well defined, making it difficult to provide an individual profile of each vocal fold during the closed phase of the glottal cycle. During the closing phase, the biomechanical properties of each vocal fold and the Bernoulli effect trigger excursion toward the midline, as far as the contralateral vocal fold, which functions as a wall.
In the patient with a cystic lesion in the vocal fold, the lesion was found to interfere with vocal fold vibration. The cyst impeded full glottic closure, leading to turbulence and subglottic air leak. The difference in mass between vocal folds and the consequent difference in glottic resistance led to different durations of the opening and closing phases in each vocal fold, interfering with the laminar flow and producing noise. As in videostroboscopy, the cyst was found to cause a reduction in vibration amplitude, incomplete glottic closure, and asymmetry in the vibration cycle between the vocal folds.20 Unlike videostroboscopy, HSV allowed phase asymmetry to be observed within the same vocal fold. The cyst in the right vocal fold interfered with the vibration dynamics of the contralateral fold. Phase asymmetry in the same vocal fold is a detail that cannot be seen with videostroboscopy.
Videostroboscopy has limitations for the differential diagnosis between spasmodic dysphonia and muscle tension dysphonia.13 In both conditions, there are pitch breaks and spasms that involve irregular vibration, making it impossible to synchronize the stroboscopic signal. In addition, supraglottic constriction prevents the visualization of the glottis.
Other diagnostic methods (flexible videoendoscopy, auditory-perceptual assessment, aerodynamic assessment, and electroglottography) also present limitations in differentiating between the two pathologies.
HSV allows for pitch breaks to be assessed and quantified, therefore representing a new tool in the diagnostic arsenal. Kinematic oscillating breaks, defined as a complete interruption of vocal fold movement during a speech task, are highly specific for spasmodic dysphonia and may be used as a criterion in the differential diagnosis with muscle tension dysphonia.13 In the present study, the patient with spasmodic dysphonia presented several pitch breaks, which are suggestive of the diagnosis.
In the present study, the images recorded with HSV allowed us to make a subjective analysis of vocal fold vibration and to interpret the findings on the basis of our previous knowledge of the anatomy and physiology of the larynx.
Despite the limitations of this new diagnostic tool (high cost, time-consuming image analysis, and the need for a rigid telescope), the contribution of HSV goes beyond the subjective and isolated interpretation of the vocal fold vibration of each individual shown in this study. In recent years, various studies have been conducted with the aim of standardizing normality parameters and comparing them with data collected from patients with dysphonia.14 15 21 22 23 24 25 Studies of image segmentation and processing methods have quantified the data collected and have identified a relationship between vocal fold vibration and the acoustic characteristics of the voice.26 27 HSV is also used for the objective quantification of surgical outcomes and the evaluation of the effectiveness of speech therapy.
HSV is the latest diagnostic tool in the visual examination of vocal behavior and has considerable potential to refine our knowledge of vocal fold vibration and voice production, as well as of the impact that pathologic conditions have on the mechanism of phonation. Through high-speed image sampling and adequate spatial resolution, objective parameters may be extracted by applying quantification algorithms. However, the contribution these data make to clinical practice has yet to be fully investigated.