Modulo matters

Debashish Sharma

May 17, 2015 • 7 min read

Going back to school days, recall a simple formula
Dividend=Divisor $ times $ Quotient + Remainder

It may be of little value to us now but of course it was quite costly in the good old school days as problems related to it were priced at 5 or 6 marks in the Maths exams! Coming to the point, this formula is actually quite well known in Mathematics as the following:

Division Algorithm: Given integers $ a $ and $ n $ with $ nneq 0 $, there exist unique integers $q$ and $r$ such that $ a=nq+r $, where $0<r<|n|$.

Thus $ a $ leaves remainder $ r $ when divided by $ n $. A lot of interesting concepts have come up from the study of remainders. Such study involving remainders is done in modular arithmetic. We write `$ aequiv b $ mod $ n $’ (read as “$ a $ is congruent to $ b $ modulo $n$”) whenever $ a-b $ is divisible by $ n $ i.e. whenever $ a $ and $ b $ leave the same remainder when divide by $n$. Thus, if $r$ is the remainder when $a$ is divided by $n$, then $aequiv r$ mod $n$. This is also written as $a text{ mod }n=r$. We can always take `$a$ mod $n$’ as the least non-negative remainder (the $r$ of division algorithm) when $a$ is divided by $n$. So all possible values of $a$ mod $n$ are $0,1,2,ldots,n-1$. We then proceed to define the addition and multiplication of integers modulo $n$ as:

$a+_n b=$ Least non-negative remainder when $a+b$ is divided by $n$.

$atimes_n b=$ Least non-negative remainder when $atimes b$ is divided by $n$.

Thus, $4+_3 6=1$ and $8times_5 6=3$. Modular arithmetic is actually an abstraction of a short cut method of counting that we often use. For example, if it is $11:00$ a.m. now then what will be the time after $243$ hours? Clearly that will be $2:00$ p.m. but interestingly, to arrive at the answer we don’t need to start at $11:00$ a.m and count off all the 243 hours. Instead we observe that the same time repeats every 12 hours and $243=24times 10+3$. So its enough to add three hours to $11:00$ a.m. to answer the question. So, if our clock were not working since 26 hours, we must advance it by 2 hours on getting it repaired and not by 26 hours !

Modular arithmetic is often used in assigning an extra digit to identification numbers and codes for detecting forgery or errors. Let us start from a simple example – Students in a certain class have roll numbers from 001 to 100. We re-number them by adding an extra digit $r=$ (Roll number) mod 5. Then the numbers become 0011, 0022, 0033, 0044, 0050, 0061,…, 1000 etc. It is to be noted that 0125 or 0562 etc cannot be the roll number of any of the students in that class !

The United States Postal Service money orders have identification number consisting of 10 digits together with an extra digit called check, which is equal to the 10-digit number modulo 9. Thus the number 8961672109 has the check digit 4. If by mistake the number is entered into a computer (programmed to calculate the check digit) as 89626721094 (error in the fourth position), the machine would calculate the check as 5 and the error would be detected. But this way of assigning check digits may not detect all errors. For example, in the roll numbers re-numbering system mentioned above, if someone enters $0833$ instead of $0383$, then the error (transposition of adjacent digits) would escape undetected. Of course there are better ways of assigning check digits. One such method is used in assigning Universal Product Code (UPC) to most of the retail items. A UPC identification number has $12$ digits. The first six digits identify the manufacturer, the next five identify the product and the last is a check.

Universal Product Code (Image Source : https://www.carolinabarcode.com/upc-barcode-introduction-a-89.html )

The check digit generation involves the dot product calculation of $n$ tuples i.e. $(a_1,a_2,ldots,a_n).(w_1,w_2,ldots,w_n)=a_1w_1+a_2w_2+ldots+a_nw_n$. For an item with UPC number $a_1a_2ldots a_{12}$, the check digit $a_{12}$ should be such that the dot product $(a_1,a_2,ldots,a_{12}).(3,1,3,1,ldots,3,1)=3a_1+a_2+3a_3+a_4+ldots+3a_{11}+a_12$ is divisible by 10 i.e.

$(a_1,a_2,ldots,a_{12}).(3,1,3,1,ldots,3,1)$mod $10=0$

In other words, $a_{12}$ is the additive inverse of $(3a_1+a_2+3a_3+ldots+3a_{11})$mod 10. The fixed $n$-tuple of 3’s and 1’s used in the calculation is called the weighing vector. This scheme detects all single-digit errors and nearly all errors involving transposition of adjacent digits. We observe that a transposition error of the form $(a_1a_2ldots a_{i}a_{i+1}ldots a_{12})rightarrow (a_1a_2ldots a_{i+1}a_{i}ldots a_{12})$ will go undetected if and only if
$(a_1ldots a_{i}a_{i+1}ldots a_{12}).(3,1,ldots,3,1) equiv (a_1ldots a_{i+1}a_{i}ldots a_{12}).(3,1,ldots,3,1) text{ mod } 10$

$Rightarrow (a_i+3a_{i+1})equiv (3a_i+a_{i+1})text{ mod } 10 text{ or } (3a_i+a_{i+1})equiv (a_i+3a_{i+1})text{ mod } 10$

depending on whether $i$ is odd or even. In both cases we have $2(a_{i+1}-a_i)$mod 10 = 0, which is possible only when $|a_{i+1}-a_i|=5$. Thus the only undetected transposition errors of adjacent digits $a$ and $b$ occur when $|a-b|=5$.

An extension of UPC is EAN. Originally it meant European Article number, but now it has been renamed as International Article Number, even though the abbreviation EAN has been retained. EAN is a 13-digit (12 data and 1 check) number. To calculate the check digit, the weighing vector is taken as $(1,3,1,3,ldots,1,3)$.

Identification numbers printed on bank cheques consist of an eight-digit number $a_1a_2ldots a_8$ along with a check digit $a_9$, satisfying the criteria $$(a_1,a_2,a_3,a_4,a_5,a_6,a_7,a_8,a_9).(7,3,9,7,3,9,7,3,9)text{ mod } 10 = 0$$
This method detects all single digit errors and all errors involving the transposition of adjacent digits $a$ snd $b$ except when $|a-b|=5$. Together with this, it also detects most of the errors of the form $ldots abcldots rightarrow ldots cbaldots$, whereas the UPC method detects no such errors.

The International Standard Book Number (ISBN) is a ten digit number having the property $(a_1,a_2,ldots,a_{10}).(10,9,8,7,6,5,4,3,2,1)$ mod 11 = 0. The digit $a_{10}$ is the check digit. For instance, the ISBN of the book Contemporary Abstract Algebra $[1]$ is 81-7319-269-3. The dot product comes out to be 264 which is in fact 0 modulo 11. The curious reader may try writing simple computer programs to check the correctness of ISBNs of various books !

One more thing that we are probably aware of is the IMEI number. If your mobile is with you now, then dial `$*#06#$’ and immediately two 15-digit numbers will come up on your mobile screen. These are the IMEI numbers for your mobile only. The International Mobile Station Equipment Identity(IMEI) number is used to identify GSM, UMTS, LTE and iDEN mobile phones. The IMEI (14 decimal digits plus a check digit) includes information on the origin, model, and serial number of the device. The model and origin comprise the initial 8-digit portion of the IMEI, known as the Type Allocation Code (TAC). The remainder of the IMEI is manufacturer-defined, with a Luhn check digit at the end. The check digit is calculated using the Luhn algorithm. To explain the Luhn algorithm, I take the IMEI number of my handset : 35286505554894 (the first 14 digits). Now starting from the right, we double every digit in odd place to get $3(10)2(16)6(10)0(10)5(10)4(16)9(8)$ and now we take the sum of the digits ie $3+(1+0)+2+(1+6)+6+(1+0)+0+(1+0)+5+(1+0)+4+(1+6)+9+(8)=55$. The check digit is the additive inverse if this sum modulo 10. So, here it’s 5. Thus my 15-digit IMEI number is 352865055548945.

Modular arithmetic is thus quite useful. Of course there are various other ways of generating codes. In India, the recently introduced unique identity number, given in Aadhaar Card, has a trailing 12th digit that is calculated with the Verhoeff algorithm, which uses the Dihedral groups. It was developed by Dutch mathematician Jacobus Verhoeff in 1969. India has a large population and nearly a billion IDs have to be issued by the Unique Identification Authority of India and so it has to be ensured that the check digit scheme used should absolutely minimize data-entry errors to reduce server load, customer service load, and overall aggravation. Hence, the Verhoeff scheme is ideal for this purpose as it catches all single errors and all adjacent transpositions. It also detects more than $95%$ of twin errors and more than $94%$ jump transpositions.

There are some branches of Algebra dealing with the theory of codes like Coding theory and Cryptography, and I am sure that would be highly interesting and surely a promising area of research. The interested reader is advised to read the books [1] and [6] for further information.

Sources:

J.A.Gallian; Contemporary Abstract Algebra; Fourth Edition (Indian); Narosa Publishing House.
http://en.wikipedia.org/wiki/Universal$_ $Product$_$Code
http://www.morovia.com/education/symbology/ean-13.asp
http://en.wikipedia.org/wiki/IMEI
http://uidai.gov.in/UID$_$PDF/Working$_$Papers/A$_$UID$_$Numbering$_$Scheme.pdf
http://www.cs.utsa.edu/~wagner/lawsbookcolor/laws.pdf
https://www.carolinabarcode.com/upc-barcode-introduction-a-89.html
http://imei-number.com/check-digit-calculation/

This article first appeared in the 56th Edition of Purbashree, the annual magazine of Gurucharan College, Silchar. The author has made some minor changes in the article and posted it to Gonit Sora.