Critical Data Analysis Precedes Soft Computing Of Medical Data

Diedrich Graf von Keyserlingk, Jan Jantzen, G. Berks, Armgard Gräfin von Keyserlingk, H. Axer

    Research output: Chapter in Book/Report/Conference proceeding; Article in proceedings; Research


    Medical databases generally appear as collections of poorly defined uncomfortable feelings, disturbances and disabilities of patients, encoded in medical terms and symptoms and often only sparsely enriched with ordinal and metric data. Astonishingly, in many cases this is sufficient for an experienced practitioner to treat patients successfully. The first requirement for a scientific evaluation of such a database is to obtain as much specific information as is available from as many patients as possible. This is essential because of the great variability of human diseases. The next step is to extract from the database the data relevant to the question of interest and to the anticipated answers and models, which assign relevance to the data. The data have to be objective, reliable, constant, and independent. Independence is the main topic of this paper. Several methods, such as analysis of variance, discriminant analysis, multiple regression and factor analysis, make it possible to control the interdependence of data. Factor analysis has the advantage that it is based on a mathematical model and does not require the data to be normally distributed. Factor analysis describes the correlations among many variables by a few independent factors. The number of factors that can be extracted from a correlation matrix is a reliable criterion for the independent information inherent in that matrix. Several data sets obtained from the Aphasia Database were analyzed, such as different groups of patients, groups of symptoms, and symptoms in time sequence. The 26 aphasic symptoms (e.g. dysarthria, paraphasias, neologisms, agrammatism) documented in the reports of the 265 aphasic patients yielded 3 factors. When the symptoms were graded according to the severity of disability in the individual cases, 5 factors were extracted. The factors had different relationships (loadings) to the symptoms.
Although the factors were obtained purely by computation, they seemed to express some modular features of the language disturbances. This phenomenon, in which factors represent superordinate aspects of the data, is well known in factor analysis. Factor I conveys the overall severity of the disturbance, factor II points to the expressive versus comprehension-related character of the language disorder, factor III represents the granularity of the phonetic mistakes, factor IV accentuates the patient's awareness of the disease, and factor V exposes the deficits in communication. Sets of symptoms corresponding to the traditional symptoms of Broca's and Wernicke's aphasia may be represented in the factors, but a factor itself does not represent a syndrome. It is assumed that this kind of data analysis offers a new approach to the understanding of language disturbances, one that should represent functional entities of the brain more closely than the traditional clinical descriptions do.
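
The idea that the number of factors extractable from a correlation matrix indicates how much independent information it holds can be illustrated with a minimal sketch. The abstract does not say which extraction method or retention rule the authors used, so the sketch below assumes principal-component extraction with the Kaiser criterion (retain eigenvalues greater than 1). The data are synthetic and are not the Aphasia Database; the function names `count_factors` and `synthetic_scores`, the noise level and the choice of three latent sources are hypothetical and chosen only for illustration.

```python
import numpy as np


def count_factors(data, threshold=1.0):
    """Count factors in the correlation matrix of `data` and return loadings.

    data: array of shape (n_patients, n_symptoms).
    A factor is retained when its eigenvalue exceeds `threshold`
    (Kaiser criterion; an assumption, not taken from the paper).
    """
    corr = np.corrcoef(data, rowvar=False)       # symptoms are columns
    eigvals, eigvecs = np.linalg.eigh(corr)      # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]            # sort descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    n_factors = int(np.sum(eigvals > threshold))
    # Principal-component loadings: eigenvector scaled by sqrt(eigenvalue).
    loadings = eigvecs[:, :n_factors] * np.sqrt(eigvals[:n_factors])
    return n_factors, loadings


def synthetic_scores(n_patients=265, n_symptoms=26, n_latent=3,
                     noise=0.5, seed=0):
    """Generate synthetic symptom scores driven by `n_latent` hidden sources.

    Purely illustrative: each symptom follows exactly one latent source
    plus independent Gaussian noise.
    """
    rng = np.random.default_rng(seed)
    latent = rng.standard_normal((n_patients, n_latent))
    assignment = np.arange(n_symptoms) % n_latent
    mixing = np.zeros((n_latent, n_symptoms))
    mixing[assignment, np.arange(n_symptoms)] = 1.0
    return latent @ mixing + noise * rng.standard_normal((n_patients, n_symptoms))


data = synthetic_scores()
n_factors, loadings = count_factors(data)
print("factors retained:", n_factors)
print("loadings shape:", loadings.shape)
```

Each row of `loadings` describes how strongly one symptom relates to each retained factor, which corresponds to the "relationships (loadings)" mentioned in the abstract. Other extraction methods, retention rules or rotations would generally give different counts and loadings.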
    Original language: English
    Title of host publication: ESIT 2000 Final Programme and Proceedings (Abstracts and CD)
    Publication date: 2000
    Publication status: Published - 2000
    Event: European Symposium on Intelligent Techniques - Aachen, Germany
    Duration: 14 Sep 2000 to 15 Sep 2000


    Conference: European Symposium on Intelligent Techniques
