# Are two words anagrams?

When correcting a single solution takes considerable time.

During the exam of my introductory programming course, I asked
students to formulate a procedure that determines if two words are anagrams,
*i.e.*, they agree in the frequency of their constituting
characters but not their order. One of my students came up with an
original procedure of which I could not immediately assess the
validity.

The algorithm she wrote was:

```
if len(word2) > len(word1):
return False
for c in word1:
if word1.count(c) > word2.count(c):
return False
return True
```

For me this was a novel formulation. And this is how I verified her solution.

## The first `if`

-statement

The first `if`

-statement establishes that

$|w_2| \leq |w_1|.$

This is important, if the two words are to agree in their character frequencies, one cannot be larger than the other. The use of an inequality here is peculiar. Typical solutions of this form check for equality.

## The loop

Next, is a loop that establishes that for each character $\sigma \in w_1$ its less or equally frequent in $w_1$.

$\#(\sigma \in w_1) \leq \#(\sigma \in w_2).\qquad{(1)}$

Again, we notice the use of an inequality where one would expect an equality.

## Combining what we know

Without having explicitly checked for equality of character frequency and length, can we still be certain that we only reach the final return statement if the two words are in fact anagrams? Let us combine the two.

Taking the sum of both sides of eq.Â 1 over the characters in $w_1$, and knowing that we count the occurrence of characters in $w_1$, we obtain

$|w_1| = \sum_{\sigma \in w_1} \#(\sigma \in w_1) \leq \sum_{\sigma \in w_1} \#(\sigma \in w_2) \leq |w_2|.$

The last inequality stems from the fact that $w_2$ may contain characters not in $w_1$. We now have $|w_2| \leq |w_1|$ and $|w_1| \leq |w_2|$, therefore $|w_1| = |w_2|$.

Finally, we note that the inequalities in eq.Â 1 cannot be strict, as this would entail that $|w_1| < |w_2|$.

We thus conclude that $w_1$ and $w_2$ are equally long with the same character frequencies. This is the definition of an anagram.

Well done!