Towards Clone Detection in UML Domain Models

Harald Störrle

    Research output: Contribution to journal; Journal article; Research; peer-review

    Abstract

    Code clones (i.e., duplicate fragments of code) have long been studied, and there is strong evidence that they are a major source of software faults. Anecdotal evidence suggests that the same phenomenon occurs in models, which would make model clones as detrimental to model quality as code clones are to code quality. However, programming language code and visual models differ significantly, which makes it difficult to transfer notions and algorithms developed for code clones directly to model clones. In this article, we develop a definition of the notion of “model clone” based on a thorough analysis of practical scenarios. We propose a formal definition of model clones, specify a clone detection algorithm for UML domain models, and implement it as a prototype. We investigate different similarity heuristics to be used in the algorithm, and report on the performance of our approach. While we believe that our approach advances the state of the art significantly, it is restricted to UML models, its results leave room for improvement, and it has not been validated by field studies.
    Original language: English
    Journal: Software and Systems Modeling
    Volume: 12
    Issue number: 2
    Pages (from-to): 307-329
    ISSN: 1619-1366
    DOI: 10.1007/s10270-011-0217-9
    Publication status: Published - 2013

    Bibliographical note

    The original publication is available at www.springerlink.com

    Keywords

    • Model maintenance
    • Model clones
    • Model similarity
    • Model evolution
    • Model management

    Cite this

    Störrle, Harald. Towards Clone Detection in UML Domain Models. In: Software and Systems Modeling. 2013; Vol. 12, No. 2, pp. 307-329.
    @article{80adcd12807140dfaa9c5dfcc9617cc5,
    title = "Towards Clone Detection in UML Domain Models",
    keywords = "Model maintenance, Model clones, Model similarity, Model evolution, Model management",
    author = "Harald St{\"o}rrle",
    note = "The original publication is available at www.springerlink.com",
    year = "2013",
    doi = "10.1007/s10270-011-0217-9",
    language = "English",
    volume = "12",
    pages = "307--329",
    journal = "Software and Systems Modeling",
    issn = "1619-1366",
    publisher = "Springer",
    number = "2",
    }
