Clone Code Detection Tools and Algorithms

For my research, I have collected a list of publicly available tools for "clone detection", that is, the detection of identical or similar code fragments in given source programs.

| Name | Supported languages | Approach | License | Usage |
|---|---|---|---|---|
| Duploc | C, C++, Java | line-by-line string matching | GPL, written in Smalltalk | |
| PHP:Duploc | PHP | ??? | GPL, written in PHP | |
| pmd (cpd) | Java | Karp-Rabin string matching algorithm | BSD-style | Use ant task shown on the link (for avoiding file names enumeration) |
| Simian | Java, C#, C, C++, COBOL, Ruby, JSP, ASP, HTML, XML, Visual Basic, or any text files | ??? | Free for non-commercial and open source use, written in .NET 1.1 or Java 1.4, source code unavailable | java -jar .../simian-2.2.4.jar -recurse *.java |
| SimScan | Java | works on the parsed source tree (using ANTLR parser) | Free for non-commercial and open source use, written in Java, source code unavailable | |
| dupwatch (dupwatch.jar, dupwatch.tgz) | Java | Finding duplicated code via "metric fingerprints" | ??? | |

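To give a feel for the string-matching approaches listed in the table, here is a minimal sketch of Karp-Rabin style rolling hashing applied to windows of consecutive lines. It is an illustration only and not the implementation of any tool above: the function name `find_duplicate_windows`, the line-level granularity, the whitespace-stripping normalization, and the `base`/`mod` constants are all assumptions made for this example.

```python
from collections import defaultdict


def find_duplicate_windows(lines, window=3, base=257, mod=(1 << 61) - 1):
    """Return sorted groups of start indices whose `window` consecutive
    lines are identical after stripping surrounding whitespace."""
    norm = [ln.strip() for ln in lines]
    if window < 1 or len(norm) < window:
        return []

    # Map every line to an integer once; the rolling hash combines these.
    codes = [hash(ln) % mod for ln in norm]
    high = pow(base, window - 1, mod)  # weight of the oldest line in a window

    # Hash of the first window.
    h = 0
    for c in codes[:window]:
        h = (h * base + c) % mod

    buckets = defaultdict(list)
    buckets[h].append(0)

    # Slide the window: drop the oldest line, append the newest one.
    for i in range(1, len(codes) - window + 1):
        h = ((h - codes[i - 1] * high) * base + codes[i + window - 1]) % mod
        buckets[h].append(i)

    groups = []
    for starts in buckets.values():
        # Equal hashes may be collisions, so compare the actual text.
        by_text = defaultdict(list)
        for s in starts:
            by_text[tuple(norm[s:s + window])].append(s)
        groups.extend(g for g in by_text.values() if len(g) > 1)
    return sorted(groups)


if __name__ == "__main__":
    sample = ["a = 1", "b = 2", "c = a + b", "print(c)",
              "a = 1", "b = 2", "c = a + b", "x"]
    print(find_duplicate_windows(sample, window=3))
```

Each window hash is updated in constant time from the previous one, which is the point of the rolling-hash technique; the final text comparison guards against hash collisions.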
Here are some links to tools that are not publicly available:
  • Code clone related tools summarized by Osaka University, the developer of CloneWarrior/CCFinder/Gemini (currently available only upon request)
  • JPlag: a tool for detecting software plagiarism. An account is available on request
  • Moss: a tool for detecting cheating in university programming classes. Free internet service available only to instructors in programming courses
  • CloneDR (Clone Doctor): a well-known but commercial tool. Their paper says it is based on ASTs, ignoring identifier names
  • Dup and Pdiff by Brenda S. Baker at Bell Labs. Only the papers are available
  • Dotplot: similarity pattern visualization tool
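
The idea of comparing syntax trees while ignoring identifier names, mentioned for CloneDR above, can be sketched with a toy example. This is not CloneDR's algorithm, which is not described here beyond that one remark; it is a minimal illustration on Python source using the standard `ast` module. The names `_EraseNames`, `shape`, and `same_shape` are hypothetical, and attribute names such as `obj.attr` are deliberately left untouched.

```python
import ast


class _EraseNames(ast.NodeTransformer):
    """Replace variable, argument, and function names with a placeholder."""

    def visit_Name(self, node):
        return ast.copy_location(ast.Name(id="_", ctx=node.ctx), node)

    def visit_arg(self, node):
        self.generic_visit(node)
        node.arg = "_"
        return node

    def visit_FunctionDef(self, node):
        self.generic_visit(node)
        node.name = "_"
        return node


def shape(source):
    """Return a textual form of the AST with identifier names erased."""
    tree = _EraseNames().visit(ast.parse(source))
    # ast.dump omits line and column attributes by default.
    return ast.dump(tree)


def same_shape(a, b):
    """True if two snippets have identical ASTs up to identifier renaming."""
    return shape(a) == shape(b)


if __name__ == "__main__":
    first = "def total(xs):\n    s = 0\n    for x in xs:\n        s += x\n    return s\n"
    renamed = "def sum_all(items):\n    acc = 0\n    for it in items:\n        acc += it\n    return acc\n"
    print(same_shape(first, renamed))
```

Two fragments that differ only in naming produce the same erased tree, while a change of operator or constant does not. A real detector would also have to find matching subtrees inside larger files, which this sketch does not attempt.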

UML Tools

Eclipse plug-ins for UML handling (not all have been verified to work)
* Omondo EclipseUML, visual modeling tool, natively integrated with Eclipse and CVS. Free version is available. It depends on EMF (Eclipse Modeling Framework), GEF (Graphical Editor Framework) and UML2.
* Enterprise Architect/MDG link for Eclipse: a 30-day trial is available
* Visual Paradigm SDE for Eclipse (SDE-EC): Community Edition is free for non-commercial use, but the free edition has no reverse engineering feature
* Borland Together Edition for Eclipse
* Slime UML
* MagicDraw
* CanyonBlue Konesa

* Rational Software Modeler
* Telelogic TAU Generation2

Other tools
* Class Abstraction Tool by Alexander Egyed of USC

Coding Styles for Open Source Software Projects

* Mozilla coding style guide
* GNU coding standards
* C and C++ Style Guides