EMV is the worldwide standard for smart card payments

  • EMV (named after its original developers: Europay, MasterCard and Visa) is now used for over 80% of in-person card payments worldwide
  • Initially EMV was a standard for contact smart cards, but it has since expanded to include the closely related contactless payment standards and other payment standards such as 3D-Secure (for online payments)
  • Standards are maintained by EMVCo, which makes the current version publicly available
  • Over the past 15 years colleagues and I have found numerous security vulnerabilities

EMV is primarily a compatibility standard, not a security standard

  • It is designed to allow terminals and cards to work together, even without fully understanding the data processed, and even where cards have very limited RAM and processing power
  • Be conservative in what you do, be liberal in what you accept from others (Postel's law)
  • This has the potential to create security vulnerabilities, by increasing complexity and risking that important data will not be properly interpreted

One way EMV achieves compatibility is through the TLV format

  • Data is encoded as Tag, Length, Value
  • Very efficient compared to JSON or XML, but not easy to read by eye
  • Tree structure can be decoded without knowing all tags
  • Unknown tags can be ignored (for better or worse)
  • 0x00 bytes are ignored between TLV items, allowing in-place deletion (historically 0xff too)
  • Also known as ASN.1 BER (Basic Encoding Rules) format, from X.209 (now X.690), used for example in X.509 HTTPS certificates
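
The properties listed above can be illustrated with a short recursive decoder. This is a minimal sketch, not a production parser: the function name `parse_tlv` and its return shape are hypothetical, it assumes definite-length encoding only, and it treats 0x00 between items as padding.

```python
def parse_tlv(data):
    """Parse BER-TLV bytes into a list of (tag_hex, value) pairs.

    value is bytes for a primitive object and a nested list for a
    constructed one. Hypothetical helper, for illustration only.
    """
    items, pos = [], 0
    while pos < len(data):
        if data[pos] == 0x00:              # padding between items is skipped
            pos += 1
            continue
        # Tag: if the low five bits are all set, more tag bytes follow,
        # and the last one has b8 clear.
        start = pos
        pos += 1
        if data[start] & 0x1F == 0x1F:
            while pos < len(data) and data[pos] & 0x80:
                pos += 1
            pos += 1
        tag = data[start:pos]
        if pos >= len(data):
            raise ValueError("data ends before the length field")
        # Length: short form if b8 is clear; otherwise b7..b1 give the
        # number of length bytes that follow.
        length = data[pos]
        pos += 1
        if length & 0x80:
            n = length & 0x7F
            if n == 0 or pos + n > len(data):
                raise ValueError("unsupported or truncated length field")
            length = int.from_bytes(data[pos:pos + n], "big")
            pos += n
        if pos + length > len(data):
            raise ValueError("value runs past the end of the data")
        value = data[pos:pos + length]
        pos += length
        constructed = bool(tag[0] & 0x20)  # b6 set: value holds more TLV
        items.append((tag.hex(), parse_tlv(value) if constructed else value))
    return items


# A made-up example: template 0x70 holding tag 0x5f20, one padding
# byte, and a private tag 0xdf01 that the decoder knows nothing about.
example = bytes.fromhex("70 0a 5f 20 02 41 42 00 df 01 01 ff")
print(parse_tlv(example))
# [('70', [('5f20', b'AB'), ('df01', b'\xff')])]
```

The decoder never needs a table of known tags: the constructed bit alone says whether to descend, which is why unknown tags can be carried along or ignored. The explicit length checks matter, since a declared length larger than the remaining data is an easy way to read out of bounds.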

It is sometimes helpful to manually decode TLV data

  • Data might be incomplete or corrupt
  • You have to explain why the decoded version is correct
  • You might have to write your own decoder (though I wouldn't recommend it)
  • Doing things for yourself can help you find where others might have slipped up

Follow along yourself at https://murdoch.is/:/emvdecode with notes at https://murdoch.is/:/emvdecodenotes and repository at https://murdoch.is/:/emvdecoderepo

In [1]:
## Some helpful utilities for processing hex data
from hexutils import *
In [2]:
## Convert hex to binary
to_bin('AA')
Out[2]:
'10101010'
In [3]:
## Strip whitespace around and within hex
strip_bytes("303\n  132 ")
Out[3]:
'303132'
In [4]:
## Split hex into bytes for display
split_bytes("303\n  132 ")
Out[4]:
'30 31 32'
In [5]:
## Count how many bytes in a hex string
len_bytes("303\n  132 ")
Out[5]:
3
In [6]:
## Convert bytes to text (using ISO8859-1)
decode_bytes("303\n  132 ")
Out[6]:
'012'
In [7]:
## Format a byte into a binary table
format_bytes('aa')
0xaa =b8b7b6b5b4b3b2b1
10101010
In [8]:
## Split a byte into fields of specified length
format_bytes('aa', [1,2,0,5])
0xaa =b8b7b6b5b4b3b2b1
1-------
-01-----
--------
---01010
In [9]:
## Take a certain number of bytes from a hex string
take("303\n  132 ", 2)
Out[9]:
'30 31'
In [10]:
## Take a certain number of bytes from a hex string with an offset
take("303\n  132 ", 2, 1)
Out[10]:
'31 32'
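
For readers without the repository to hand, the helpers above can be approximated in a few lines. These are stand-in definitions inferred only from the inputs and outputs shown in the cells above, not the actual contents of `hexutils`, and the real module may behave differently in cases not shown. `format_bytes` is omitted.

```python
def strip_bytes(h):
    """Remove all whitespace around and within a hex string."""
    return "".join(h.split())

def split_bytes(h):
    """Insert a space between each byte (pair of hex digits)."""
    s = strip_bytes(h)
    return " ".join(s[i:i + 2] for i in range(0, len(s), 2))

def len_bytes(h):
    """Number of bytes in a hex string."""
    return len(strip_bytes(h)) // 2

def to_bin(h):
    """Hex string to a string of binary digits, 8 per byte."""
    return "".join(f"{b:08b}" for b in bytes.fromhex(strip_bytes(h)))

def decode_bytes(h):
    """Hex string to text, using ISO 8859-1 (latin-1)."""
    return bytes.fromhex(strip_bytes(h)).decode("latin-1")

def take(h, n, offset=0):
    """Return n bytes starting at byte offset, lower-case and spaced.

    Lower-casing is an assumption based on the outputs shown later.
    """
    s = strip_bytes(h).lower()
    return split_bytes(s[2 * offset:2 * (offset + n)])
```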

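cardpeek log output, requesting EMV record 2 from the file with Short File Identifier (SFI) 2. The first attempt returns 6C97 (wrong length, retry with 0x97); the second returns 9000 (success)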

C:00B2 02 14 00 :6C97:
C:00B2 02 14 97 :9000: 7081948C219F0206...

In [11]:
## Reference control parameter P2: b8-b4 hold the SFI (2), b3-b1 = 100 means P1 is a record number
format_bytes('14', [5,3])
0x14 =b8b7b6b5b4b3b2b1
00010---
-----100
In [12]:
## Length of record is 0x97 = 151 bytes
0x97
Out[12]:
151
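
The two log lines can also be decoded programmatically. This is a sketch under the ISO 7816-4 READ RECORD layout (CLA, INS 0xB2, P1, P2, Le) and the convention that status 6Cxx means "wrong length, retry with xx". The function names and dictionary keys are hypothetical.

```python
def describe_read_record(command_hex):
    """Split a 5-byte READ RECORD command into its fields."""
    cla, ins, p1, p2, le = bytes.fromhex(command_hex)
    if ins != 0xB2:
        raise ValueError("not a READ RECORD command")
    return {
        "record": p1,                                 # record number
        "sfi": p2 >> 3,                               # b8-b4 of P2
        "p1_is_record_number": (p2 & 0x07) == 0x04,   # b3-b1 = 100
        "le": le,                                     # expected length
    }

def exact_length_hint(status_hex):
    """Return the length the card asks for if the status is 6Cxx."""
    sw1, sw2 = bytes.fromhex(status_hex)
    return sw2 if sw1 == 0x6C else None


print(describe_read_record("00B2 02 14 00"))
# {'record': 2, 'sfi': 2, 'p1_is_record_number': True, 'le': 0}
print(exact_length_hint("6C97"))
# 151
```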
In [13]:
## Response as a Python string
response="7081948C219F02069F03069F1A0295055F2A029A039C019F37049F35019F45029F4C089F34038D0C910A8A0295059F37049F4C089F08020002571352AAAAAAAAAAAA47D15122011407992700000F5F20134D5552444F43482F53544556454E204A2E44525F300202019F1F183134303739303030303030303030303932373030303030309F420208269F4401029F49039F37049F470103"

Response is 7081948C219F02069F03...

In [14]:
## Look at the first byte of the response (a tag)
take(response, 1)
Out[14]:
'70'
In [15]:
## Application class, constructed, one-byte tag
format_bytes(_, [2,1,5]) # 0x70 is a READ RECORD response message template
0x70 =b8b7b6b5b4b3b2b1
01------
--1-----
---10000
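
The three fields shown in the table (two class bits, one constructed bit, five tag-number bits) can be read off with shifts and masks. A small sketch; the function name and the tuple it returns are hypothetical.

```python
CLASSES = ["universal", "application", "context-specific", "private"]

def describe_tag_byte(b):
    """Return (class, constructed, multi_byte) for the first tag byte."""
    tag_class = CLASSES[b >> 6]          # b8-b7
    constructed = bool(b & 0x20)         # b6
    multi_byte = (b & 0x1F) == 0x1F      # b5-b1 all set: tag continues
    return tag_class, constructed, multi_byte


for first_byte in (0x70, 0x8C, 0x9F, 0x5F):
    print(hex(first_byte), describe_tag_byte(first_byte))
# 0x70 ('application', True, False)
# 0x8c ('context-specific', False, False)
# 0x9f ('context-specific', False, True)
# 0x5f ('application', False, True)
```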

Response is 7081948C219F02069F03...

In [16]:
## First byte of length is 0x81...
take(response, 1, 1)
Out[16]:
'81'
In [17]:
## b8 is 1, so this is the long form: b7-b1 (here 1) give the number of length bytes that follow
format_bytes(_)
0x81 =b8b7b6b5b4b3b2b1
10000001
In [18]:
## The actual length is 0x94...
take(response, 1, 2)
Out[18]:
'94'
In [19]:
## which is 148 in decimal
int(_, 16)
Out[19]:
148

Response is 7081948C219F02069F03...

In [20]:
## The value is 148 bytes, starting after tag (1 byte) and length (2 bytes)...
take(response, 148, 1+2)
Out[20]:
'8c 21 9f 02 06 9f 03 06 9f 1a 02 95 05 5f 2a 02 9a 03 9c 01 9f 37 04 9f 35 01 9f 45 02 9f 4c 08 9f 34 03 8d 0c 91 0a 8a 02 95 05 9f 37 04 9f 4c 08 9f 08 02 00 02 57 13 52 aa aa aa aa aa aa 47 d1 51 22 01 14 07 99 27 00 00 0f 5f 20 13 4d 55 52 44 4f 43 48 2f 53 54 45 56 45 4e 20 4a 2e 44 52 5f 30 02 02 01 9f 1f 18 31 34 30 37 39 30 30 30 30 30 30 30 30 30 30 39 32 37 30 30 30 30 30 30 9f 42 02 08 26 9f 44 01 02 9f 49 03 9f 37 04 9f 47 01 03'
In [21]:
## which is the whole of the rest of the response from the card
len_bytes(response) - 3
Out[21]:
148

Response is 7081948C219F02069F03...

In [22]:
## The value is constructed so the next byte is a tag
take(response, 1, 1+2)
Out[22]:
'8c'
In [23]:
## Context-specific class, primitive, 1-byte tag
format_bytes(_, [2,1,5]) # 0x8c - CDOL1
0x8c =b8b7b6b5b4b3b2b1
10------
--0-----
---01100

Response is 7081948C219F02069F03...

In [24]:
## Next byte will be the length
take(response, 1, 1+2+1)
Out[24]:
'21'
In [25]:
## b8 is 0 so this is a 1 byte length (0x21)...
format_bytes(_)
0x21 =b8b7b6b5b4b3b2b1
00100001
In [26]:
## which is 33 in decimal
int(_, 16)
Out[26]:
33

Response is 7081948C219F02069F03...

In [27]:
## The CDOL1 is 33 bytes, skipping the tags and lengths 
cdol1 = take(response, 33, 1+2+1+1)
cdol1
Out[27]:
'9f 02 06 9f 03 06 9f 1a 02 95 05 5f 2a 02 9a 03 9c 01 9f 37 04 9f 35 01 9f 45 02 9f 4c 08 9f 34 03'
In [28]:
## After the CDOL1 the next tag is the CDOL2
take(response, 1, 1+2+1+1 + 33) # 0x8d - CDOL2
Out[28]:
'8d'
In [29]:
## with length 0x0c (12)
take(response, 1, 1+2+1+1 + 33 + 1)
Out[29]:
'0c'
In [30]:
## So the CDOL2 can be extracted
cdol2 = take(response, 0x0c, 1+2+1+1 + 33 + 1+1)
cdol2
Out[30]:
'91 0a 8a 02 95 05 9f 37 04 9f 4c 08'

CDOL1 is 9f 02 06 9f 03 06 9f 1a 02 95 05...

In [31]:
## DOL objects are a list of tags and lengths that describe how to
## send data to a card possibly unable to decode TLV data
take(cdol1, 1)
Out[31]:
'9f'
In [32]:
## 9f starts a context-specific class, primitive, multi-byte tag
format_bytes(_, [2,1,5])
0x9f =b8b7b6b5b4b3b2b1
10------
--0-----
---11111
In [33]:
## The next byte of the tag is 0x02
take(cdol1, 1, 1)
Out[33]:
'02'
In [34]:
## b8 is 0, so 0x02 is the last byte of the tag, giving 0x9f02
format_bytes(_, [1,7]) # 0x9f02 - Amount, Authorised (Numeric)
0x02 =b8b7b6b5b4b3b2b1
0-------
-0000010
In [35]:
## Next is the length of the data expected: 0x06
take(cdol1, 1, 1 + 1)
Out[35]:
'06'
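
The same three steps (tag, optional further tag bytes, one length byte) repeat until the DOL is exhausted, so the whole list can be split mechanically. A sketch assuming every DOL length fits in a single byte and the input is well formed; `parse_dol` is a hypothetical name.

```python
def parse_dol(dol_hex):
    """Split a DOL into (tag_hex, expected_length) pairs.

    A DOL holds tags and lengths only; there are no values to skip.
    """
    data = bytes.fromhex(dol_hex)
    entries, pos = [], 0
    while pos < len(data):
        start = pos
        pos += 1
        if data[start] & 0x1F == 0x1F:   # multi-byte tag
            while data[pos] & 0x80:      # b8 set: another tag byte follows
                pos += 1
            pos += 1
        entries.append((data[start:pos].hex(), data[pos]))
        pos += 1                         # step over the length byte
    return entries


cdol1 = ("9f 02 06 9f 03 06 9f 1a 02 95 05 5f 2a 02 9a 03 9c 01 "
         "9f 37 04 9f 35 01 9f 45 02 9f 4c 08 9f 34 03")
entries = parse_dol(cdol1)
print(entries[:4])
# [('9f02', 6), ('9f03', 6), ('9f1a', 2), ('95', 5)]
print(len(entries), sum(length for _, length in entries))
# 12 43
```

The terminal answers a DOL by concatenating the requested values in order with no tags or lengths, so for this CDOL1 the card expects exactly 43 bytes of data.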
In [36]:
## Going back to the response, another 2-byte tag is at offset 78...
take(response, 2, 78) # 0x5f20 – Cardholder Name
Out[36]:
'5f 20'
In [37]:
## which has length 0x13 (19)
take(response, 1, 80)
Out[37]:
'13'
In [38]:
## The value of this tag is ASCII encoded
take(response, 0x13, 81)
Out[38]:
'4d 55 52 44 4f 43 48 2f 53 54 45 56 45 4e 20 4a 2e 44 52'
In [39]:
decode_bytes(_)
Out[39]:
'MURDOCH/STEVEN J.DR'
In [40]:
## Another 2-byte tag is at offset 100...
take(response, 2, 100) # 0x5f30 – Service Code
Out[40]:
'5f 30'
In [41]:
## with length 0x02
take(response, 1, 102)
Out[41]:
'02'
In [42]:
## and in binary-coded decimal format: 201
strip_bytes(take(response, 2, 103))
Out[42]:
'0201'
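
Offsets such as 78 and 100 need not be counted by hand. The sketch below walks the objects inside the 0x70 template and reports where each tag starts. `read_header` and `list_offsets` are hypothetical helpers, and there is no bounds checking, so this is only suitable for well-formed data.

```python
response = "7081948C219F02069F03069F1A0295055F2A029A039C019F37049F35019F45029F4C089F34038D0C910A8A0295059F37049F4C089F08020002571352AAAAAAAAAAAA47D15122011407992700000F5F20134D5552444F43482F53544556454E204A2E44525F300202019F1F183134303739303030303030303030303932373030303030309F420208269F4401029F49039F37049F470103"

def read_header(data, pos):
    """Return (tag_hex, length, value_start) for the object at pos."""
    start = pos
    pos += 1
    if data[start] & 0x1F == 0x1F:       # multi-byte tag
        while data[pos] & 0x80:
            pos += 1
        pos += 1
    tag = data[start:pos].hex()
    length = data[pos]
    pos += 1
    if length & 0x80:                    # long form: b7-b1 length bytes follow
        n = length & 0x7F
        length = int.from_bytes(data[pos:pos + n], "big")
        pos += n
    return tag, length, pos

def list_offsets(data):
    """List (offset, tag_hex, length) for each object in the outer template."""
    _, length, pos = read_header(data, 0)
    end = pos + length
    found = []
    while pos < end:
        tag, length, value_start = read_header(data, pos)
        found.append((pos, tag, length))
        pos = value_start + length       # jump over the value to the next tag
    return found


for offset, tag, length in list_offsets(bytes.fromhex(response)):
    print(offset, tag, length)
```

For this record the walk should report, among others, tag 57 at offset 57, tag 5f20 at offset 78 and tag 5f30 at offset 100.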
In [43]:
## At offset 57 we have a 1-byte tag (with length 0x13, so the value starts at offset 59)...
take(response, 1, 57) # 0x57 – Track 2 Equivalent Data
Out[43]:
'57'
In [44]:
## which is also in binary-coded decimal
## I've removed the middle of my card number ;-)
strip_bytes(take(response, 0x13, 59))
Out[44]:
'52aaaaaaaaaaaa47d15122011407992700000f'
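
The digits can be split into fields. This is a sketch assuming the usual Track 2 Equivalent Data layout: card number, a `d` separator nibble, expiry as YYMM, a 3-digit service code, discretionary data, and an `f` nibble of padding if needed to fill the last byte. The function name and keys are hypothetical.

```python
def parse_track2(value_hex):
    """Split Track 2 Equivalent Data (as hex digits) into its fields."""
    digits = "".join(value_hex.split()).lower().rstrip("f")  # drop padding
    pan, _, rest = digits.partition("d")                     # 'd' is the separator
    return {
        "pan": pan,
        "expiry_yymm": rest[:4],
        "service_code": rest[4:7],
        "discretionary": rest[7:],
    }


print(parse_track2("52aaaaaaaaaaaa47d15122011407992700000f"))
# {'pan': '52aaaaaaaaaaaa47', 'expiry_yymm': '1512',
#  'service_code': '201', 'discretionary': '1407992700000'}
```

The service code here (201) matches the value read from tag 0x5f30 earlier, which is a useful cross-check that both fields were decoded correctly.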

Decoding TLV properly is very tricky

  • Different encodings are used in different contexts
  • Overflow and underflow errors could easily occur
  • Mistakes do happen, and so banks do accept transactions that should in theory be invalid
  • Maybe you would like to do this yourself, whether professionally or just out of curiosity
  • “The only way to understand the wheel is to reinvent it.” — Mike Bond

More on my research – https://murdoch.is/
Research group blog – http://www.benthamsgaze.org/