Best practices for comparing DuplicateCodeFragment structural equivalence (AST vs. DuplicatesFinder)

Hi everyone,

I’m currently working on a plugin that performs automated method extraction on code duplicates detected via a custom analysis (similar to IntelliJ Deodorant).

I am struggling to find the cleanest way to filter out ‘outlier’ fragments from a set of potential duplicates. My goal is to group fragments that are syntactically identical (allowing for alpha-renaming of variables) to form a new, clean refactoring group.

I have explored PsiEquivalenceUtil.areElementsEquivalent, but it feels too strict because it compares exact literals (e.g., x = 1 vs x = 3 are seen as different).

My questions are:

  1. Is there a recommended way to perform a ‘structural’ comparison that ignores literal values but respects method call targets?

  2. Should I rely on DuplicatesFinder directly for this comparison, or is there a more lightweight utility in the RefactoringUtil / CodeInsight package to determine if two PsiElement[] arrays are ‘refactoring-compatible’?

Any pointers to relevant internal classes or documentation would be greatly appreciated.

Thanks for your help!

PsiEquivalenceUtil.areElementsEquivalent is meant to check actual equality of elements, not the shape of the code.

There is none. This is a huge, separate problem in the industry, and everyone solves it according to their own idea of what a good result looks like. As far as I know, there is no ready-made utility that would magically cover duplicate detection for your requirements.

DuplicatesFinder is close, but be aware that it also produces many false positives. They are acceptable for our target use cases, and we are not going to do anything about them. It also supports only some languages, not all of them.

Duplicate detection is such a big task that there are startups that try to do only that.
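
For what it's worth, here is a rough sketch of what a home-grown "structural" comparison could look like: literal values are ignored (only the literal's kind must match), call targets must match exactly, and variables may be renamed as long as the renaming is consistent in both directions. Note that this runs on a toy AST, not on PSI. Every name in it (StructuralCompare, Node, Literal, Var, Call, Assign) is made up for illustration, and mapping real PsiElement[] fragments onto something like this is left open.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy model, NOT IntelliJ PSI. All types here are illustrative.
final class StructuralCompare {
    interface Node {}
    record Literal(Object value) implements Node {}           // value assumed non-null
    record Var(String name) implements Node {}
    record Call(String target, List<Node> args) implements Node {}
    record Assign(Var lhs, Node rhs) implements Node {}

    // Two fragments are equivalent if their statements match pairwise
    // under one consistent, one-to-one renaming of variables.
    static boolean equivalent(List<Node> a, List<Node> b) {
        if (a.size() != b.size()) return false;
        Map<String, String> fwd = new HashMap<>(); // names in a -> names in b
        Map<String, String> bwd = new HashMap<>(); // names in b -> names in a
        for (int i = 0; i < a.size(); i++) {
            if (!match(a.get(i), b.get(i), fwd, bwd)) return false;
        }
        return true;
    }

    private static boolean match(Node x, Node y,
                                 Map<String, String> fwd, Map<String, String> bwd) {
        if (x instanceof Literal lx && y instanceof Literal ly) {
            // Ignore the value itself, but keep the literal kind (int vs String, ...).
            return lx.value().getClass() == ly.value().getClass();
        }
        if (x instanceof Var vx && y instanceof Var vy) {
            // Alpha-renaming: the mapping must be stable and injective.
            String mapped = fwd.putIfAbsent(vx.name(), vy.name());
            String back = bwd.putIfAbsent(vy.name(), vx.name());
            return (mapped == null || mapped.equals(vy.name()))
                    && (back == null || back.equals(vx.name()));
        }
        if (x instanceof Call cx && y instanceof Call cy) {
            // Call targets are compared exactly, arguments structurally.
            if (!cx.target().equals(cy.target())) return false;
            if (cx.args().size() != cy.args().size()) return false;
            for (int i = 0; i < cx.args().size(); i++) {
                if (!match(cx.args().get(i), cy.args().get(i), fwd, bwd)) return false;
            }
            return true;
        }
        if (x instanceof Assign ax && y instanceof Assign ay) {
            return match(ax.lhs(), ay.lhs(), fwd, bwd)
                    && match(ax.rhs(), ay.rhs(), fwd, bwd);
        }
        return false; // different node kinds
    }

    public static void main(String[] args) {
        Var x = new Var("x"), y = new Var("y");
        List<Node> a = List.of(new Assign(x, new Literal(1)), new Call("foo", List.of(x)));
        List<Node> b = List.of(new Assign(y, new Literal(3)), new Call("foo", List.of(y)));
        List<Node> c = List.of(new Assign(y, new Literal(3)), new Call("bar", List.of(y)));
        System.out.println(equivalent(a, b)); // true: x = 1 vs y = 3, same call target
        System.out.println(equivalent(a, c)); // false: foo vs bar
    }
}
```

Keeping two maps (forward and backward) is a deliberate choice: with only one map, two different variables in the first fragment could both be renamed to the same variable in the second, which would usually not be safe to extract into one method. Whether this level of strictness is right depends on what you consider "refactoring-compatible".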