Automatic Detection and Rating of Dementia of Alzheimer Type through Lexical Analysis of Spontaneous Speech

Calvin Thomas, Vlado Kešelj, and Nick Cercone
Faculty of Computer Science
Faculty of Medicine
Saint Mary's University

Abstract— Current methods of assessing dementia of Alzheimer type (DAT) in older adults involve structured interviews that attempt to capture the complex nature of deficits suffered. One of the most significant areas affected by the disease is the capacity for functional communication as linguistic skills break down. These methods often do not capture the true nature of language deficits in spontaneous speech. We address this issue by exploring novel automatic and objective methods for diagnosing patients through analysis of spontaneous speech. We detail several lexical approaches to the problem of detecting and rating DAT. The approaches explored rely on character n-gram-based techniques, shown recently to perform successfully in a different but related task of automatic authorship attribution. We also explore the correlation between DAT and the usage frequency of different parts of speech. We achieve a high accuracy of 95% in detecting dementia when compared with a control group, 70% accuracy in rating dementia into two classes, and 50% accuracy in rating dementia into four classes. Our results show that purely computational solutions offer a viable alternative to standard approaches to diagnosing the level of impairment in patients. These results are a significant step forward toward automatic and objective means of identifying early symptoms of DAT in older adults.

Index Terms— Automatic diagnostics, machine learning, natural language processing

I. INTRODUCTION

Current methods of assessing dementia of Alzheimer type (DAT) in older adults involve structured interviews that attempt to capture the complex nature of deficits suffered. One of the most significant areas affected by the disease is the capacity for functional communication as linguistic skills break down. With this fact in mind, interviews are designed to test linguistic abilities, including confrontation naming [1], single word production [2] or word generation given context [3]. However, these methods sometimes fail to identify early symptoms observed by family members during normal conversation [4], and often fail to describe adequately the level of impairment in low-scoring patients, unless similarities exist between performance during exams and in normal conversation [5]. In developing new tests, researchers should look for automatic and objective methods for use in rating dementia in patients through analysis of spontaneous speech that overcome the shortfalls of current methods [6]. Research advances in the areas of discourse analysis, language modeling and text classification may be applicable to this area and may lead to such progress.

In this paper, we detail several lexical approaches to the problem of detecting and rating DAT in patients from our corpus. The large corpus used in our research consists of transcripts from the Atlantic Canada Alzheimer's Disease Investigation of Expectations (ACADIE) study of the drug donepezil [7]. The goal of this research is to explore whether automatic techniques based on the analysis of spontaneous speech can provide objective measures of dementia levels in AD patients. It is our hope that improvements in automatic techniques will extend what is understood about the effects of dementia in Alzheimer's patients and the breakdown of language faculties.

The research discussed in this paper includes natural language processing and machine learning techniques that were applied to the problem of rating DAT in older adults. This interdisciplinary area brings opportunities for novel research to be conducted with generic text classification algorithms. Also explored are novel extensions to existing techniques that were developed to address specific qualities inherent to the corpus analyzed.

In short, we found that purely computational solutions offer a viable alternative to standard approaches to diagnosing the level of impairment in patients. Although more work needs to be done to improve the accuracy of these methods, these results are a significant step forward towards automatic and objective means of identifying early symptoms of DAT in older adults.

II. BACKGROUND AND RELATED WORK

Dementia of Alzheimer type. A significant component of the dementia of Alzheimer type (DAT) that accompanies Alzheimer's disease (AD) is aphasia, a loss of written and oral communicative ability [8], [9]. Symptoms of aphasia include breakdowns in semantic processing, shallow vocabularies and word-finding difficulties leading to the deterioration of spontaneous speech [10]. This deterioration begins early in the onset of the disease and is often observed by family members during conversational situations [4]. Further, recent studies of oral and written spelling have shown marked differences in language ability between AD patients and healthy older adults [11], [12].

For example, Ronald Reagan, former president of the United States, exhibited signs of AD from the outset of his presidency. Reagan's speeches suffered from word-finding difficulties, inappropriate phrases and uncorrected sentences that were obvious signs of his deterioration, but the fact that he had AD was not released until 1994 [13].

Current methods of assessing DAT levels in patients involve structured interviews that attempt to capture the breakdown of communicative capacity by testing specific linguistic abilities, including confrontation naming [1], single word production [2] or word generation given context [3]. However, these methods sometimes fail to identify early symptoms observed by family members during normal conversation [4], and often fail to describe adequately the level of impairment in low-scoring patients, unless similarities exist between performance during exams and in normal conversation [5].

Mini-Mental State Exam. The Mini-Mental State Exam (MMSE) is a cognitive grading scale used in the assessment of patients, first described by Folstein et al. [14] in 1975. This test addressed a need for a relatively short screening exam that could be used to reliably identify cognitive impairment in a clinical setting. Here, "mini" refers to the fact that this exam concentrates only on the cognitive impairment of mental function and excludes mental deficits covered by comprehensive exams, including mood and abnormal mental functions [14].

The MMSE involves a patient responding to 17 questions that cover a wide range of cognitive domains: orientation, registration, short-term memory, attention, calculation, visuo-spatial skills and praxis. Testing of these areas is divided into two sections. The first requires verbal responses to orientation, memory, and attention questions. The second requires reading and writing and covers the ability to name, follow verbal and written commands, write a sentence, and draw intersecting pentagons. Testing time varies according to impairment level, ranging between 5 and 10 minutes, and the exam can be administered by clinicians, nurses, psychologists, paramedical staff and lay interviewers with limited training.

Since the introduction of the MMSE, this test has been widely used in clinical applications as an aid to diagnosis and in monitoring the progression of the dementia in individuals. The exam is also standardly used in the clinical and therapeutic research community as a basis for discretizing populations into normal, mild, moderate and severe dementia levels according to the DSM-IV [8]. Less standard, however, is the selection of boundary points in a community setting, since performance has been linked to level of education and other issues that may be characteristic of the population. With that said, "a variety of cutpoints have been suggested over the years, with 17/18 for clear-cut cases, 21/22, 23/24 and even 25/26" [15].

Verbal Picture Descriptions. Verbal picture descriptions can be used to assess the level of cognitive impairment and "are among the most sensitive measures for assessing spontaneous speech in AD" [10]. In these exams, the patient is supplied with a simple or complex line drawing that he or she must verbally describe. These narratives are recorded on tape and later analyzed according to a variety of speech attributes including articulation, grammar, phrase length, paraphasias, word-finding difficulties, themes and information content. While simple pictures may be useful in identifying patients with moderate deficiencies, more complex drawings may be helpful for screening patients with mild dementia [10], [13].

III. A LEXICAL APPROACH

Research in the area of automatic dementia detection in Alzheimer's patients has been quite limited, with few results found in a search of the literature [6], [16]. Bucks et al. [6] conducted a small study with 24 individuals: 8 patients and 16 healthy controls. The authors collected 8 lexical statistics over the first 1000 words of spontaneous speech during interviews, namely noun (N), pronoun (P), adjective (A) and verb (V) rates, type-token ratio (TTR), Brunet's Index (W), Honoré's Statistic (R) and the Clause-like Semantic Unit (CSU) rate. The results showed that the stylometric attributes had sufficient discriminating power in distinguishing between the language models of AD sufferers and control subjects.

N-rate, P-rate, A-rate and V-rate are the average rates of occurrence for each respective part-of-speech (POS) category. These measures capture the lexical distribution of the spoken words and were selected heuristically. Bucks et al. found that AD patients had "higher mean P-rate, A-rate, V-rate scores, but lower N-rate scores compared with normal older controls".

The next three statistical attributes were selected to capture the lexical richness of the participant's speech. The type-token ratio (TTR) is the ratio of the total vocabulary to the overall text length
and is sensitive to the length of the text collected. This measure is calculated as

$$\mathrm{TTR} = \frac{V}{N}$$

where V is the total vocabulary and N is the overall text length; higher values are associated with a broader vocabulary. Brunet's Index (W) is a length-insensitive version of TTR calculated using the following equation:

$$W = N^{V^{-0.165}}$$

The resulting value typically ranges between 10 and 20, with richer speech producing lower values [17]. Honoré's Statistic (R) is also insensitive to length and is calculated as

$$R = \frac{100 \log N}{1 - V_1 / V}$$

where V_1 is the number of words in the vocabulary spoken only once. Higher values of R indicate a richer vocabulary [18].

The CSU rate is a "measure of semantic cohesion in phrases . . and characterizes the participant's ability to form noun and verb phrases and gives an indication of the flow of speech" [19]. To calculate this value, the corpus must first be hand-tagged according to a set of 13 rules that identify cohesion boundaries in phrases. The CSU rate is the average number of units found per 100 words. Patients suffering from dysphasia find it difficult to formulate long phrases, leading to higher CSU rates than in normal speakers and making this variable "the most important discriminator between normal and dysphasic speech" [16]. Bucks et al. [6] confirmed that AD patients use less rich speech vocabulary according to the three lexical richness measures (TTR, W and R), but significant differences in CSU rates between AD patients and controls were not found in the data. Table I gives a summary of the attributes detailed above.

TABLE I
ATTRIBUTE SET DESCRIBED IN BUCKS ET AL. [6]

Attribute   Description
N-rate      Noun rate
P-rate      Pronoun rate
A-rate      Adjective rate
V-rate      Verb rate
TTR         Type-token ratio
W           Brunet's Index
R           Honoré's Statistic
CSU rate    Clause-like semantic unit rate

Common N-Grams (CNG) approach. The Common N-Grams (CNG) approach to authorship attribution uses character n-grams to model consistencies in author style. Traditional n-gram language models intuitively treat documents as a sequence of words and rely on word n-grams to capture consistencies with state-of-the-art performance [20]. However, several difficulties arise when working with word-based methods, including language dependencies explicitly built into the model, word segmentation concerns and sparsity of data due to the large vocabulary. Overcoming these obstacles is particularly difficult when dealing with Asian languages such as Chinese or Japanese that do not have explicit word boundaries. By using byte-level n-grams the authors dramatically reduce the vocabulary, clearly define boundaries between units and do not make use of any language-dependent information, including word boundaries, character case, white-space characters or punctuation [21]. However, due to their frequency and consistency of use by authors, white-space and punctuation characters implicitly play a significant role in classifier performance.

Author models are represented by CNG profiles that are defined as "a set of the most frequent n-grams with their normalized frequencies generated from training data" [21] and, hence, the two parameters of importance to the CNG method are the n-gram size n and the profile length L. Due to the fixed and small vocabulary of ASCII characters used, the CNG method does not suffer from the sparse data problems of word n-gram approaches at low values of n. To be sure, the work in [21] indicates that values of n up to [...] can be employed before computational limitations and performance decreases are encountered. This point contrasts with word-based approaches, which are computationally feasible with n up to 3 or 4 [20]. The profile length L limits the number of n-grams considered during the similarity calculation and serves to keep profiles small when large values of n are used. Small profile lengths not only improve computational performance but also reduce model overfitting. This was supported by the fact that a pruning threshold was shown to improve accuracy, with optimal values lying [...] n-grams [21].

[Algorithm 1: Profile dissimilarity. Iterates over all n-grams of the compared profiles and returns the accumulated dissimilarity.]

Classification via Common Word Frequencies. Using common word frequencies as style markers has been studied extensively by Burrows [22], [23], [24] and further investigated by Stamatatos et al. [25]. Both of these approaches focused on using the most frequent words in a text corpus as style markers. The primary difference between these two approaches is the training corpus from which these style markers were selected. Burrows argues for frequent terms that are selected from the target corpus itself and has shown effective classification results over a wide variety of literature domains [23], [24]. Stamatatos et al. [25] improved on previous results by extracting these style markers from the British National Corpus rather than the target corpus itself.

[Fig. 1. Histogram of MMSE score]
[Fig. 2. Histogram of MMSE-based classes]
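The lexical richness measures and the CNG profile comparison described in Section III can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in the study: the function names are hypothetical, tokens are assumed to be pre-split words, Python characters stand in for the byte-level n-grams of [21], and the dissimilarity is taken to be the sum of squared relative differences of normalized frequencies, which is assumed here to be what Algorithm 1 computes.

```python
import math
from collections import Counter


def lexical_richness(tokens):
    """Return (TTR, Brunet's W, Honoré's R) for a list of word tokens."""
    if not tokens:
        raise ValueError("at least one token is required")
    n = len(tokens)                    # N: overall text length
    counts = Counter(tokens)
    v = len(counts)                    # V: total vocabulary
    v1 = sum(1 for c in counts.values() if c == 1)  # V1: words spoken once
    ttr = v / n
    w = n ** (v ** -0.165)
    # R is undefined when every word occurs exactly once (V1 == V).
    r = math.inf if v1 == v else 100 * math.log(n) / (1 - v1 / v)
    return ttr, w, r


def cng_profile(text, n=3, length=1000):
    """Map the `length` most frequent character n-grams to normalized frequencies."""
    grams = Counter(text[i:i + n] for i in range(len(text) - n + 1))
    total = sum(grams.values())
    return {g: c / total for g, c in grams.most_common(length)}


def cng_dissimilarity(profile_a, profile_b):
    """Sum squared relative frequency differences over the union of n-grams.

    Assumed form of the profile dissimilarity; an n-gram missing from a
    profile is treated as having frequency zero in that profile.
    """
    total = 0.0
    for gram in set(profile_a) | set(profile_b):
        fa = profile_a.get(gram, 0.0)
        fb = profile_b.get(gram, 0.0)
        total += ((fa - fb) / ((fa + fb) / 2)) ** 2
    return total
```

Under this sketch, a transcript would be assigned to the class whose profile has the smallest dissimilarity to the transcript's own profile, with n and the profile length chosen experimentally.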
IV. PROBLEM, DATA AND SOLUTION

The research in this paper explores several approaches to the problem of automatically diagnosing the dementia level of Alzheimer's patients through analysis of spontaneous speech captured in a transcript. Each of these approaches assumes that recognizable language artifacts, which are a function of the dementia level in patients, exist. Further, we are interested in attributes that can be extracted automatically from patient transcripts and can be used to reliably and consistently model the dementia level of AD patients.

ACADIE Dataset. The dataset used during analysis and experimentation contains the language spoken by [...] Goal Attainment Scaling interviews between field researchers, Alzheimer patients, and care-givers, compiled within the Atlantic Canada Alzheimer's Disease Investigation of Expectations (ACADIE) study of donepezil [7]. The dataset includes two interviews per patient, with interviews conducted at assessment visits 12 weeks apart to examine the effects of the drugs administered during the interim. Interviews were conducted at six sites across Atlantic Canada.

MMSE scores are provided with the interview transcripts, with discretized scores in the ranges 0–15, 16–20, 21–24, and 25–30, according to [14].

V. SUMMARY OF THE RESULTS

Each of the figures in this section gives the classification performance in terms of maximum accuracy obtained for each explored approach on a specific classification task. Each chart also includes the results from a naïve ZeroR rule-based classifier, which predicts for every test instance the modal class observed during training. Overall, these results show that intelligent machine learning approaches performed better on the corpus than the naïve baseline of weighted random guessing. This indicates that pairing spontaneous speech data with machine learning techniques is a viable approach to the task of predicting dementia levels. Further, the results suggest that improvements in classification accuracy are obtained by breaking large lexical categories into their smaller constituents by including modifier relationships.

[Fig. 3. Summary of accuracy on two-class task]

Figure 3 illustrates the classification accuracies of the explored methods on the two-class prediction task. In this task the classification algorithm must label test instances as [...] scoring on the MMSE scale: one label indicates a severe or moderate level of DAT impairment, while the other indicates that the patient should be placed in the [...] dementia classes. The ZeroR rule-based classifier produced a baseline accuracy of [...] for this task. From the other classifiers explored, an accuracy range of [...] was observed. On this task, the best accuracy was shared [...], while trailing close behind was the ordinal CNG method with an accuracy of [...].
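The MMSE discretization and the ZeroR baseline used throughout this section can be sketched as follows. This is a hedged illustration rather than the study's code: the score ranges are the ones quoted from [14], but the pairing of each range with the names severe, moderate, mild and normal is an assumption, the function names are hypothetical, and the scores in the usage note are made up.

```python
from collections import Counter

# Score ranges as quoted in the text; the class names attached to each
# range are an assumption made for this sketch.
MMSE_BINS = [
    (0, 15, "severe"),
    (16, 20, "moderate"),
    (21, 24, "mild"),
    (25, 30, "normal"),
]


def mmse_class(score):
    """Map an MMSE score (0 to 30) to one of four dementia classes."""
    for low, high, label in MMSE_BINS:
        if low <= score <= high:
            return label
    raise ValueError(f"MMSE score out of range: {score}")


def zero_r(train_labels, test_labels):
    """Return the modal training class and its accuracy on the test labels.

    ZeroR ignores all attributes and always predicts the most frequent
    class seen during training.
    """
    modal, _ = Counter(train_labels).most_common(1)[0]
    hits = sum(1 for label in test_labels if label == modal)
    return modal, hits / len(test_labels)
```

For example, with hypothetical training scores 12, 18, 19, 22, 27 and test scores 17, 20, 23, 29, `zero_r` would predict the moderate class for every test instance. Any learned classifier should be compared against this figure, since a skewed class distribution can make the baseline deceptively high.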
The second classification task required the algorithm to predict one of four class labels for a test instance: [...]. The results from this task are shown in Figure 4. On this task, a baseline accuracy of [...] was set forth by the ZeroR classifier, and a range of [...] was observed. The highest accuracy was achieved by classifiers using the attribute selection method [...]. The next best classifier was standard CNG with [...], closely followed by [...].

[Fig. 4. Summary of accuracy on four-class task]

Figure 5 compares the prediction accuracy for algorithms on a third task. This task involved predicting class labels for instances from the severe and normal groups only. The naïve baseline method produced an accuracy of [...] on this task. All of the intelligent methods examined in these experiments produced significantly higher classification accuracies with [...]. Again, on this task the most accurate classifier was built over an attribute set consisting of frequent word ratios. Interestingly, both the [...] produced the same classification accuracy at [...]. One other approach produced an accuracy above [...]. A particularly noteworthy observation is that the [...] attribute set beat out [...] on this task.

[Fig. 5. Summary of accuracy on severe/normal task]

Figure 6 contains results from the mild/normal classification task. This task requires the algorithm to label test instances from the mild and normal groups only. A baseline [...] was posted for this task by the ZeroR classifier. The observed accuracy range for the other methods [...] performed the worst here and was only narrowly more accurate than the baseline. The best classification accuracy was achieved by the attribute selection method at [...]. The next closest method in terms of classification accuracy was the other frequent-words-based method at [...].

[Fig. 6. Summary of accuracy on mild/normal task]

VI. CONCLUSION

The thrust of this work was to examine the potential use of natural language processing and machine learning techniques in the diagnosis of dementia of Alzheimer type (DAT) in older adults. Framing this problem as a text classification task, we present several viable approaches based on mature algorithms and implementations. The main contributions are:

• a detailed statistical analysis of the lexical features exhibited in the spontaneous speech of older adults with Alzheimer's disease,
• novel application of several machine learning and natural language processing techniques in rating DAT,
• a novel classification algorithm in Ordinal CNG, and
• positive results in detecting DAT through an extensive exploration of classification methods.

1) Lexical analysis: A detailed statistical analysis was conducted on transcripts of spontaneous conversational speech collected from Alzheimer's patients. Analysis of spontaneous speech has the potential of offering many clues to the ties between linguistic ability and the extent of DAT. We chose to approach attribute selection from a statistical standpoint rather than rely on heuristics as in Bucks et al. [6]. We also believed that the detail of the Connexor part-of-speech (POS) tagger should be exploited to narrow the lexical categories analyzed. Our experiments confirmed the validity of our assumptions, leading to higher accuracies and a better understanding of the data. During our lexical analysis of the data we found that closed class words were particularly helpful in predicting the level of language deficit in patients. Additionally, we found that lexical richness measures were not powerful discriminators for our purposes.

2) Novel application: Applying the CNG algorithm, which was originally developed for authorship attribution, to our DAT classification problem showed that the algorithm is robust with respect to application. The standard algorithm was applied without modification and achieved some of the most accurate results observed. This robustness is due to the byte-level n-grams used to construct the class profiles.

During our lexical analysis of the data we found that closed class words were helpful in predicting the level of language deficit in patients. Naturally, this led us to examine these classes of words in more detail to determine if deeper relationships exist between the statistics and the observed effect in patients. Previous research had been done in the field of text classification where commonly used words were used as style markers. Our experiments showed that the novel approach to detecting deficit and the novel application of these generic text classification algorithms were well suited to each other, producing some of the most accurate models.

3) Algorithm extension: In addition to the standard CNG algorithm, an ordinal CNG extension was developed and tested. This algorithm was designed to take advantage of a natural ordering of classes, leveraging the training instances within the extreme groups. Our results showed that classification accuracy was not affected by the exclusion of [...] training instances. This observation leads us to believe that our method effectively generates models using fewer training instances, but with better discriminating characteristics.

4) Positive results: The positive results reported in this work were arrived at after an extensive exploration of classification methods. This research showed that several standard classification algorithms could be used to produce classification accuracies significantly higher than our naïve rule-based classifier that always selects the modal class.

REFERENCES

[1] J. Hodges, D. Salmon, and N. Butters, "The nature of the naming deficit in Alzheimer's and Huntington's disease," Brain, vol. 114, pp. 1547–1558, 1991.
[2] A. Martin and P. Fedio, "Word production and comprehension in Alzheimer's disease: the breakdown of semantic knowledge," Brain and Language, vol. 35, pp. 394–397, 1983.
[3] L. Phillips, S. D. Sala, and C. Trivelli, "Fluency deficits in patients with Alzheimer's disease and frontal lobe lesions," European Journal of Neurology, vol. 3, pp. 102–108, 1996.
[4] C. Crockford and R. Lesser, "Assessing functional communication in aphasia: Clinical utility and time demands of three methods," European Journal of Disorders of Communication, vol. 29, pp. 165–182, 1994.
[5] S. Sabat, "Language function in Alzheimer's disease: a critical review of selected literature," Language and Communication, vol. 14, pp. 331–
[6] R. Bucks, S. Singh, J. M. Cuerden, and G. Wilcock, "Analysis of spontaneous, conversational speech in dementia of Alzheimer type: Evaluation of an objective technique for analyzing lexical performance," Aphasiology, vol. 14, no. 1, pp. 71–91, 2000.
[7] K. Rockwood, J. Graham, and S. Fay, "Goal setting and attainment in Alzheimer's disease patients treated with donepezil," Journal of Neurology, Neurosurgery and Psychiatry, vol. 73, pp. 500–507, 2002.
[8] American Psychiatric Association, Diagnostic and Statistical Manual of Mental Disorders, 4th ed., Washington, DC, 1994.
[9] J. Cummings, F. Benson, M. Hill, and S. Read, "Aphasia in dementia of the Alzheimer type," Neurology, vol. 35, pp. 394–397, 1985.
[10] K. Forbes, A. Venneri, and M. Shanks, "Distinct patterns of spontaneous speech deterioration: an early predictor of Alzheimer's disease," Brain and Cognition, vol. 48(2-3), pp. 356–61, 2002.
[11] S. Pestell, M. Shanks, J. Warrington, and A. Venneri, "Quality of spelling breakdown in Alzheimer's disease is independent of disease progression," Journal of Clinical and Experimental Neuropsychology, vol. 22, pp. 599–612, 2000.
[12] H. Platel, J. Lambert, F. Eustache, B. Cadet, M. Dary, F. Viader, and B. Lechevalier, "Characteristics and evolution of writing impairment in Alzheimer's disease," Journal of Clinical and Experimental Neuropsychology, vol. 22, pp. 599–612, 1993.
[13] A. Venneri, O. Turnbull, and S. Della Sala, "The taxonomic perspective: the neuropsychological diagnosis of dementia," Revue Européenne de Psychologie Appliquée, vol. 46, pp. 81–86, 1996.
[14] M. Folstein, S. Folstein, and P. McHugh, "Mini-mental state. A practical method for grading the cognitive state of patients for the clinician," Journal of Psychiatric Research, vol. 12, pp. 189–198, 1975.
[15] C. Brayne, "The mini-mental state examination, will we be using it in 2001?" International Journal of Geriatric Psychiatry, vol. 13, pp. 285–294, 1998.
[16] D. Holmes and S. Singh, "A stylometric analysis of conversational speech of aphasic patients," Literary and Linguistic Computing, vol. 11, pp. 45–60, 1996.
[17] E. Brunet, "Le vocabulaire de Jean Giraudoux," Structure et Evolution,
[18] A. Honoré, "Some simple measures of richness of vocabulary," Association of Literary and Linguistic Computing Bulletin, vol. 7, pp. 172–177, 1979.
[19] S. Singh, "Computational analysis of conversational speech in dysphasic patients," Ph.D. dissertation, University of the West of England,
[20] F. Peng, D. Schuurmans, V. Keselj, and S. Wang, "Automated authorship attribution with character level language models," in Proceedings of the 10th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2003), 2003.
[21] V. Keselj, F. Peng, N. Cercone, and C. Thomas, "N-gram-based author profiles for authorship attribution," in Proceedings of Pacific Association for Computational Linguistics (PACLING'03), 2003.
[22] J. Burrows, "Word-patterns and story-shapes: The statistical analysis of narrative style," Literary and Linguistic Computing, vol. 2, no. 2, pp. 61–70, 1987.
[23] J. Burrows, "Not unless you ask nicely: The interpretative nexus between analysis and information," Literary and Linguistic Computing, vol. 7, no. 2, pp. 91–109, 1992.
[24] J. Burrows, "'Delta': a measure of stylistic difference and a guide to likely authorship," Literary and Linguistic Computing, vol. 17, no. 3, pp. 267–287, 2002.
[25] E. Stamatatos, N. Fakotakis, and G. Kokkinakis, "Text genre detection using common word frequencies," in Proceedings of 18th International Conference on Computational Linguistics (COLING2000), vol. 2, 2000, pp. 808–814.

Source: http://vlado.cs.dal.ca/papers/icma05.pdf
