mslib – A library for decoding magnetic stripes

After a number of posts describing how to decode audio streams of magnetic stripes to bits and then characters, I now present a C library that does all that.

The creatively named “mslib” can take a signed 16-bit little-endian PCM stream of a magnetic stripe and decode it.

My own usage tests have been at 48 kHz, but it should handle higher sample rates and slightly lower ones.  When changing the sample rate from 48 kHz, you may need to adjust the “peakOffset” and “peakThreshold” values using the functions defined and documented in mslib.h.

The code is written in C and available on GitHub here.  It’s presently released under the GPLv3 and has a single library dependency of glib-2.0. [Edit: No longer has a glib dependency; it should be entirely C99 compatible]

The library is tersely documented like most open source code, but comes with an example/test utility.  Decoding a stream consists of the following steps:

  1. Create the mslib object and load the PCM stream as a signed array of short integers using ms_create().
  2. Create a list of peaks, used for decoding the stream to bits, using ms_peaks_find().
  3. Filter the list of peaks for duplicates using ms_peaks_filter().
  4. Decode the peaks to bits using ms_decode_peaks().  The bits as written on the card may now be fetched as a NUL-terminated char array with ms_get_bitStream().
  5. Decode the bits to characters using ms_decode_bits().  The characters as written on the card may now be fetched as a NUL-terminated char array with ms_get_charStream().

The library can successfully decode any magnetic stripe to bits (subject to a clean audio signal).  It will decode the bits to characters for streams encoded using the ABA and IATA encoding schemes as found on almost all magnetic stripes.

Development is still continuing, with a top TODO item of removing the glib dependency [Edit: glib has been removed in current builds].  Bug reports, comments and feature requests are welcome and can be left on github’s site.

Some example PCM streams that can be decoded with the utility:

capitalone.pcm – A Visa credit card

wamu.pcm – A Visa debit card

wamu-bad.pcm – An example of a bad swipe of the Visa debit card.

(All credit card numbers are of course no longer valid [not just expired].  All the streams are from track 2 and use ABA encoding)


MagRead is a Qt based GUI application originally written for Maemo 5.

The application is designed to read all types of magnetic stripes that follow the ABA or IATA encoding schemes.  In addition, it attempts to format the data from cards it recognises.

Magnetic stripe data is read over the audio input jack on the mobile phone.  A hardware adapter is required, which primarily consists of something similar to an old cassette tape head.  The start of the application displays a minimalistic welcome screen:

MagRead Start Screen
Start screen for MagRead

At present, credit cards and identification cards are the only cards that will display with their data formatted.

Credit Cards

Visa, MasterCard, American Express and Discover are currently shown:

Visa Card
Capital One Visa Credit Card (inactive number)

If an expired credit card is swiped, the “Expiration Date” field will appear in red:

Debit Card (inactive number)
Washington Mutual Visa Debit Card (inactive number)

AAMVA Driver Licence/Identification Cards

Cards in compliance with AAMVA standards will also be recognised.  These are identification cards and driver licence cards issued by any government based in North America (approx. 22 US states/territories, several Canadian provinces and at least one Mexican state).

When swiped with a track 2 reader, the issuing authority (state/province/agency/territory) is displayed, along with the person’s unique ID number, date of expiration, date of birth, and their present age:

AAMVA Compliant CA DL
California Driver Licence (DL# Redacted)

Expired cards will put their expiration date in red, and include the word in caps “EXPIRED” underneath.  Additionally, if the person’s age is under 18, it will appear in red.  Those over 18 but under 21 will have their age appear in yellow.

Expired California Driver Licence
Expired California Driver Licence (DL# Redacted)

Miscellaneous Cards

Cards which are not identified simply have the data on them displayed:

Ralphs Rewards Card

Partial Reads

On the home screen, there is a “Show Partial Data” checkbox.  When checked, MagRead will display data read from cards that may contain errors or be incomplete.  By default, only swipes that have passed several parity checks are displayed.

Ralphs Rewards Card -- Red |s indicate characters with bad parity checks
Capital One Visa Card Partial Read

MagRead is still in active development, but will be available for download shortly.  It will be released under the GPL.

At present, it is written in C and C++ and designed to run only on the Nokia N900 with Maemo 5.  Work is currently in progress to port the application to standard Qt 4.6, and it may be usable on Symbian devices with Qt support.

Magnetic Stripe IATA Encoding

This post covers the encoding scheme defined by the IATA, originally for use in the airline industry.  This is, in fact, the encoding found on airline boarding passes.

It is commonly found on track 1 of magnetic stripes, and track 3 on some cards.  Unlike the ABA encoding, this is alphanumeric.

Character Encoding Overview

  • Each character is 7 bits in length.  Six bits for the character itself and one odd parity bit.
  • The character set includes 64 symbols, ranging from 0x20 – 0x5f on the ASCII table.
  • Encoded characters are offset from ASCII by 32.  E.g., the number 0 is stored as 16 decimal.  Adding 32 gives us 48, the ASCII character 0.
  • Values are encoded in Least Significant Bit (LSB) order.
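
The rules above can be sketched in a few lines of C.  This is only an illustration, not code from mslib: the function name iata_decode_char and the choice to pass the bits as a string of '0' and '1' characters are assumptions made for the example.

```c
/* Decode one 7-bit IATA character given as '0'/'1' characters in the order
 * they are read from the stripe: six data bits, LSB first, then the parity
 * bit.  Returns the ASCII character, or -1 if the input is malformed or the
 * odd parity check fails.  (Hypothetical helper, not part of mslib.) */
int iata_decode_char(const char *bits)
{
    int value = 0;
    int ones = 0;

    for (int i = 0; i < 7; i++) {
        if (bits[i] != '0' && bits[i] != '1')
            return -1;                /* also catches a string that is too short */
        if (bits[i] == '1') {
            ones++;
            if (i < 6)
                value |= 1 << i;      /* i-th bit read is bit i of the value */
        }
    }
    if (ones % 2 == 0)
        return -1;                    /* odd parity: the count of 1s must be odd */
    return value + 32;                /* shift back into ASCII 0x20-0x5f */
}
```

For instance, the digit 0 is stored as 16, which is 000010 in LSB order with a parity bit of 0, so iata_decode_char("0000100") returns the ASCII character '0'.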

The encoding follows most of the ABA encoding scheme, but with a larger character set and different ASCII offset.  Calculating the LRC and parity bits however is identical, and the same code can actually be used to decode both ABA and IATA.

Data Formatting

  • Always starts with a percent sign, the start sentinel.
  • Always ends with a question mark, the end sentinel.
  • The field separator is a caret (^).  This typically separates the account number from the name, and the name from the expiration date.
  • A 7-bit Longitudinal Redundancy Check (LRC) character follows the end sentinel.  Like ABA encoding, it uses an even parity scheme.  The 7th bit is an odd parity bit of the first 6 bits, just like a normally encoded character.
  • Track 1 can hold up to 79 characters.  Track 3 can hold up to 101 characters.

For information on how to calculate the LRC, follow the instructions in the ABA encoding post.

Common Data Formatting

  1. Start Sentinel
  2. Format Code.  A single character to define the layout, usually ‘B’
  3. The Primary Account Number (PAN), up to 19 characters in length.
  4. Field Separator
  5. Name of account holder.  Format is “SURNAME/GIVEN NAME M” (middle initial).  Supports up to 26 characters.  Cards vary on how they truncate, some eliminating the / used to separate surname from given.  Not all cards will have a middle initial.
  6. Field Separator
  7. Expiration date in YYMM
  8. Additional data, up to the remaining space left on the track.
  9. End Sentinel
  10. LRC
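
As a rough sketch of how the common layout could be split into fields once the characters are decoded, here is a small parser.  The struct, the function names and the sample string used afterwards are all made up for illustration, and it assumes the caret separator and the field order listed above, which not every card follows.

```c
#include <string.h>

/* Fields of the common track 1 layout (hypothetical struct, not from mslib). */
struct track1 {
    char format;       /* format code, usually 'B' */
    char pan[20];      /* up to 19 characters plus NUL */
    char name[27];     /* up to 26 characters plus NUL */
    char expiry[5];    /* YYMM plus NUL */
    char extra[80];    /* whatever follows the expiration date */
};

/* Copy n characters from src into dst (capacity cap) and NUL-terminate.
 * Returns 0 on success, -1 if the field does not fit. */
static int copy_field(char *dst, size_t cap, const char *src, size_t n)
{
    if (n >= cap)
        return -1;
    memcpy(dst, src, n);
    dst[n] = '\0';
    return 0;
}

/* Parse "%<format><PAN>^<name>^<YYMM><extra>?" (sentinels included, LRC
 * already stripped).  Returns 0 on success, -1 if the layout does not match. */
int track1_parse(const char *s, struct track1 *out)
{
    size_t len = strlen(s);
    if (len < 3 || s[0] != '%' || s[len - 1] != '?')
        return -1;

    const char *end = s + len - 1;              /* points at the end sentinel */
    const char *sep1 = strchr(s + 2, '^');
    if (sep1 == NULL)
        return -1;
    const char *sep2 = strchr(sep1 + 1, '^');
    if (sep2 == NULL || end - (sep2 + 1) < 4)   /* need room for YYMM */
        return -1;

    out->format = s[1];
    if (copy_field(out->pan, sizeof out->pan, s + 2, (size_t)(sep1 - (s + 2))) != 0)
        return -1;
    if (copy_field(out->name, sizeof out->name, sep1 + 1, (size_t)(sep2 - sep1 - 1)) != 0)
        return -1;
    if (copy_field(out->expiry, sizeof out->expiry, sep2 + 1, 4) != 0)
        return -1;
    return copy_field(out->extra, sizeof out->extra, sep2 + 5, (size_t)(end - (sep2 + 5)));
}
```

Given the invented string "%B4111111111111111^DOE/JOHN Q^1512101?", this fills in format 'B', PAN "4111111111111111", name "DOE/JOHN Q", expiry "1512" and extra data "101".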

Not all cards follow this format, and some even use a different field separator.

Magnetic Stripe ABA Track 2 Encoding

An overview of how data is encoded on track 2 of magnetic stripe cards

This post will cover the encoding scheme defined by the ABA for magnetic stripe cards.  This format is the de facto standard for track 2 of magnetic stripe cards.

Character Encoding Overview

  • Each character is five bits in length.  Four bits for the character itself, and an odd parity bit.
  • The character set includes 16 ASCII symbols, ranging from 0x30 – 0x3f on the ASCII table.  These are numerical digits 0-9 and the characters : ; < = > ?
  • Encoded characters are offset from ASCII by decimal 48 (thus, the numeric digit 0 is stored as 0, and a value of 48 must be added to get the ASCII character 0).
  • Values are stored in binary in Least Significant Bit (LSB) order.  E.g., the number 4 is stored as 0010 and not 0100.

I’ll go over a few quick examples of this.

The start sentinel is the semi-colon character.  This is ASCII value 59, but we subtract 48 from this.  We then have decimal 11, and in LSB binary: 1101. We then apply the parity bit to the end, which is 0 here as there is an odd number of 1s.  Thus, the encoded character is 11010.

Another example, the number 5.  This is ASCII 53, but becomes the number 5 once 48 is subtracted.  In LSB binary, it is 1010.  There is an even number of 1s, so the parity bit is 1.  The encoded character is 10101.
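
The same steps can be written as a short C function.  This is a sketch for illustration rather than mslib's own code; the name aba_encode_char and the string-of-bits output format are assumptions.

```c
/* Encode one ABA character (ASCII 0x30-0x3f) as five '0'/'1' characters:
 * four data bits in LSB order followed by an odd parity bit.
 * out must have room for 6 bytes (5 bits plus NUL).
 * Returns 0 on success, -1 if c is outside the character set.
 * (Hypothetical helper, not part of mslib.) */
int aba_encode_char(char c, char out[6])
{
    if (c < 0x30 || c > 0x3f)
        return -1;

    int value = c - 48;               /* remove the ASCII offset */
    int ones = 0;

    for (int i = 0; i < 4; i++) {
        int bit = (value >> i) & 1;   /* least significant bit first */
        ones += bit;
        out[i] = (char)('0' + bit);
    }
    out[4] = (ones % 2 == 0) ? '1' : '0';   /* make the total count of 1s odd */
    out[5] = '\0';
    return 0;
}
```

Calling aba_encode_char(';', buf) leaves "11010" in buf and aba_encode_char('5', buf) leaves "10101", matching the two worked examples.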

Data Formatting

  • Always starts with a semi-colon, the “start sentinel”, to indicate beginning of data.
  • Always ends with a question mark, the “end sentinel”, to indicate end of data.
  • The equal sign is used as a field separator.  Traditionally this is put between an account number and an expiration date.  Some cards will contain no field separators, others may contain multiple.
  • Only numeric digits should be used to store data — the remaining characters, : < >, are for hardware control purposes.
  • A 5 bit Longitudinal Redundancy Check (LRC) character always follows the end sentinel. It uses an even parity bit scheme, as opposed to the odd parity used for characters.  Its fifth bit is an odd parity of the other 4 bits, and not an LRC parity bit.
  • A total of 40 characters can be stored, including start/end sentinels and field separators.
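
To make the framing rules concrete, here is a minimal sketch that turns a string of bits back into characters.  It assumes the bits arrive as '0'/'1' characters with leading zero padding, relies on the start sentinel being the first group that begins with a 1, and stops at the end sentinel without checking the LRC.  The function name is hypothetical and this is not how mslib itself is written.

```c
#include <string.h>

/* Decode a track 2 bit string into ASCII, from the start sentinel ';'
 * through the end sentinel '?'.  Any character other than '1' is treated
 * as a 0 bit.  Returns the number of characters written to out, or -1 on
 * a parity error, a missing sentinel, or an output buffer that is too small. */
int aba_decode_bits(const char *bits, char *out, size_t cap)
{
    const char *p = strchr(bits, '1');   /* skip the zero padding */
    size_t n = 0;

    if (p == NULL)
        return -1;

    for (size_t left = strlen(p); left >= 5; left -= 5, p += 5) {
        int value = 0;
        int ones = 0;

        for (int i = 0; i < 5; i++) {
            if (p[i] == '1') {
                ones++;
                if (i < 4)
                    value |= 1 << i;     /* data bits are LSB first */
            }
        }
        if (ones % 2 == 0 || n + 1 >= cap)
            return -1;                   /* odd parity violated, or no room */

        out[n++] = (char)(value + 48);   /* add the ASCII offset back */
        if (n == 1 && out[0] != ';')
            return -1;                   /* data must open with the start sentinel */
        if (out[n - 1] == '?') {
            out[n] = '\0';
            return (int)n;
        }
    }
    return -1;                           /* ran out of bits before the end sentinel */
}
```

Fed a few padding zeros followed by 11010 10000 01000 10110 11001 00100 11111 10110 (with the spaces removed), it returns 7 and writes ";12=34?".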

Longitudinal Redundancy Calculation

The LRC is calculated by looking at all the other encoded characters on the card and calculating an even parity bit.

I’ll go over an example here.  Let’s assume we have the string ;12=34? as data to encode:

11010 ;
10000 1
01000 2
10110 =
11001 3
00100 4
11111 ?
10110 LRC

Just do addition down each column.  If the total is odd, then the LRC parity bit for that position is 1.  If the total is even, it’s a 0.  The fifth bit of the LRC is an odd parity of the first four bits.  In this case, the LRC is 1011, which has an odd number of 1s, so its parity bit is 0, making the encoded LRC 10110.  The fully encoded stream is then:

11010 10000 01000 10110 11001 00100 11111 10110
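
The column addition is easy to express in C.  The sketch below is illustrative only: the name aba_lrc and the array-of-strings input are assumptions, not mslib's interface.

```c
#include <stddef.h>

/* Compute the encoded 5-bit LRC for n encoded characters, each given as a
 * string of five '0'/'1' characters (start sentinel through end sentinel).
 * out must have room for 6 bytes.  (Hypothetical helper, not part of mslib.) */
void aba_lrc(const char *const chars[], size_t n, char out[6])
{
    int ones = 0;

    for (int col = 0; col < 4; col++) {
        int sum = 0;
        for (size_t row = 0; row < n; row++)
            sum += (chars[row][col] == '1');
        int bit = sum % 2;               /* odd column total gives 1, even gives 0 */
        ones += bit;
        out[col] = (char)('0' + bit);
    }
    out[4] = (ones % 2 == 0) ? '1' : '0';   /* odd parity over the four LRC bits */
    out[5] = '\0';
}
```

Passing the seven encoded rows for ;12=34? from the table produces "10110", the same LRC worked out by hand.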


Next I’ll do a quick overview of how data is commonly ordered on a card:

Common Data Ordering

  1. Start Sentinel
  2. Primary Account Number (PAN).  Up to 19 digits in length.  Overflow can be stored in the custom data section.
  3. Field Separator
  4. Expiration date in the format YYMM
  5. Custom data.  Length is up to the remaining space.
  6. End Sentinel

This format is found on many cards, such as credit cards.  Odds are if you have a magnetic stripe card with an identification number and expiration date, it follows this pattern.

Not all cards will follow this ordering.  Refer to the data formatting section above for the rules that should be consistently followed.

Decoding the Waveform of a Magnetic Stripe

A method to decode magnetic stripes stored as PCM audio data.

This post is going to cover the method I use for decoding magnetic stripe waveforms recorded using an audio interface.

For in-depth information on how magnetic stripes are encoded, see:

And many other websites which can be found by Google.

First off, an overview of what we’re working with.  I am using raw PCM audio files in the format S16 LE at 48 kHz.  I’ve worked with 44.1 kHz streams reliably, and easily scaled up to 192 kHz, but overall 48 kHz seems optimal.  A card swipe at 48 kHz generally consists of about 10k samples, or roughly 20 KB.  The data on the card is represented using amplitude peaks of alternating polarity.
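
Since S16 LE means two bytes per sample with the low byte first, 10k samples come to about 20 KB.  A minimal sketch of turning the raw bytes into signed samples, regardless of the host's byte order, might look like this (the function name is made up for the example):

```c
#include <stddef.h>
#include <stdint.h>

/* Convert raw S16 LE bytes into native signed 16-bit samples.
 * A trailing odd byte is ignored.  Returns the number of samples written,
 * which is at most cap.  (Hypothetical helper, not part of mslib.) */
size_t pcm_s16le_to_samples(const unsigned char *bytes, size_t nbytes,
                            int16_t *out, size_t cap)
{
    size_t n = nbytes / 2;               /* two bytes per sample */
    if (n > cap)
        n = cap;

    for (size_t i = 0; i < n; i++) {
        /* low byte first, then high byte */
        unsigned u = bytes[2 * i] | ((unsigned)bytes[2 * i + 1] << 8);
        /* values with the top bit set are negative in two's complement */
        out[i] = (u & 0x8000u) ? (int16_t)((int)u - 65536) : (int16_t)u;
    }
    return n;
}
```

For example, the byte pair 0xEE 0x02 becomes the sample 750 and 0x12 0xFD becomes -750.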

Here are some screen captures of the waveform:

PCM Waveform of a Magnetic Stripe
Start Sentinel
Start Sentinel of a Track 2 Swipe

To get the bits from the stream, we have two overall steps:

  1. Find the location of each peak in the stream.
  2. Decode the frequency of the peaks to output 0s and 1s.

To find the locations of the peaks, I follow a simple model.  I take the waveform and an offset of it, and determine where the original waveform and its offset intersect.  Here is an example of an offset waveform:

Offset Waveform
Closeup Example of an Offset Waveform

I then mark each point where the two waveforms intersect, ignoring any intersections whose amplitude falls below a certain threshold.  The threshold eliminates intersects that occur around 0 amplitude.  There will typically be more than one intersect found per peak:

Waveform intersects
Intersects of the Two Waveforms Marked by Blue Crosses

In this example, with an amplitude threshold of 750, we have consistently found two intersects per peak.  It is also possible to have situations where the little hump before the actual peak results in several intersects — all of these can be filtered out very simply.
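
Here is one way the intersect search could be sketched in C.  It rests on two assumptions of mine: that the "offset" waveform is the same signal delayed by a fixed number of samples, and that the threshold is compared against the absolute amplitude at the crossing.  The function and parameter names are illustrative and do not come from mslib.

```c
#include <stddef.h>
#include <stdlib.h>

/* Record the sample indices where the waveform crosses a copy of itself
 * delayed by `offset` samples, skipping crossings whose absolute amplitude
 * is below `threshold`.  Returns the number of indices stored in idx,
 * which is at most cap. */
size_t find_intersects(const short *pcm, size_t n, size_t offset,
                       int threshold, size_t *idx, size_t cap)
{
    size_t found = 0;

    if (offset == 0)
        return 0;                        /* a zero delay never intersects */

    for (size_t i = offset + 1; i < n && found < cap; i++) {
        /* difference between the waveform and its delayed copy */
        int prev = pcm[i - 1] - pcm[i - 1 - offset];
        int cur = pcm[i] - pcm[i - offset];

        /* the two curves cross when the difference changes sign */
        int crossed = (prev < 0 && cur >= 0) || (prev > 0 && cur <= 0);

        if (crossed && abs(pcm[i]) >= threshold)
            idx[found++] = i;
    }
    return found;
}
```

A crossing lands slightly after the true peak, by roughly half the offset, so a small offset keeps the marked position close to the peak itself.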

The filtering process I apply is to group the detected intersects by amplitude polarity.  For instance, in the above picture, the two positive-amplitude blue crosses around sample 1010 form the first group.  The second group is the two negative-amplitude crosses around sample 1075, and so forth.  Within each group, I pick the intersect with the highest absolute amplitude and assume it is the true peak.
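
The grouping can be done in a single pass over the list of intersects.  The following is a sketch under the assumption that the intersects are stored as ascending sample indices; the function name is hypothetical.

```c
#include <stddef.h>
#include <stdlib.h>

/* Collapse runs of intersects that share the same amplitude polarity into
 * one peak each, keeping the index with the largest absolute amplitude.
 * idx holds `count` ascending sample indices and is filtered in place.
 * Returns the number of peaks left in idx. */
size_t filter_peaks(const short *pcm, size_t *idx, size_t count)
{
    size_t kept = 0;

    for (size_t i = 0; i < count; i++) {
        int same_group = kept > 0 &&
            ((pcm[idx[kept - 1]] < 0) == (pcm[idx[i]] < 0));

        if (!same_group)
            idx[kept++] = idx[i];        /* polarity flipped: start a new group */
        else if (abs(pcm[idx[i]]) > abs(pcm[idx[kept - 1]]))
            idx[kept - 1] = idx[i];      /* stronger candidate within the group */
    }
    return kept;
}
```

Because each group is represented by its current best candidate, no extra storage is needed beyond the index list itself.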

Here is the signal, now filtered:

Filtered Intersects -- 1 Per Peak

I can then move on to the second step, which is to use the frequency of the peaks to decode the stored bits.

A quick overview of how the encoding works:

  • Beginning and end of stream are padded with 0s.
  • Stream is self-clocking.  The padding 0s tell you the clock speed, and you adjust the clock speed as needed while traversing the stream.
  • A zero bit occurs when the distance between peaks is a full cycle.  A one bit appears when there are two peaks in the full cycle.

Going through the list of peaks and determining the difference between them produces a valid bitstream.

Here’s an illustration of the start sentinel above broken into cycles:

Decoded Start Sentinel
Start Sentinel Decoded to Binary Data

As the stream progresses, the clock period will shrink significantly, generally starting around 90 samples per cycle and dropping to as few as 10 to 15 samples.  Since this happens gradually, you can recalculate the clock after every cycle.
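
A simple self-clocking decoder along these lines can be sketched as follows.  The details are my own assumptions rather than mslib's implementation: the clock starts as the first gap between peaks (which falls in the zero padding), a gap of at least three quarters of the current clock counts as a full cycle, and the clock is re-measured after every bit.

```c
#include <stddef.h>

/* Turn ascending peak positions into a NUL-terminated string of '0'/'1'.
 * A gap close to one clock period is a 0; two short gaps that together
 * span one period are a 1.  Returns the number of bits written. */
size_t peaks_to_bits(const size_t *peaks, size_t n, char *bits, size_t cap)
{
    size_t nbits = 0;
    size_t k = 1;

    if (cap == 0)
        return 0;
    if (n < 2) {
        bits[0] = '\0';
        return 0;
    }

    size_t clock = peaks[1] - peaks[0];      /* padding is all zeros */

    while (k < n && nbits + 1 < cap) {
        size_t gap = peaks[k] - peaks[k - 1];

        if (4 * gap >= 3 * clock) {          /* nearer a full cycle than half */
            bits[nbits++] = '0';
            clock = gap;                     /* track the changing swipe speed */
            k += 1;
        } else if (k + 1 < n) {              /* half cycle: next gap completes it */
            bits[nbits++] = '1';
            clock = peaks[k + 1] - peaks[k - 1];
            k += 2;
        } else {
            break;                           /* dangling half cycle at the end */
        }
    }
    bits[nbits] = '\0';
    return nbits;
}
```

With peaks at samples 0, 90, 180, 225, 270, 358, 402, 446 and 532, the gaps are 90, 90, 45+45, 88, 44+44 and 86, which decode to 001010 while the clock drifts from 90 down to 86.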

At the end, the card I’m using as an example is decoded to:

00000000000000001101001101000011100100001110011110000 10000100010001010101101011011100100100000010100001000 10011111001011000100100111000001000000010000110000111 11100110000000000000000000000000000000000000000000000 00000000000000000000000000000000000

Ultimately, I’m left with the decoded data:


An Albertsons Supermarket card was used as the example data on this post.  The PCM file used may be downloaded here.