Channel: George – UlduzSoft

Reverse-engineering the KaraFun file format. Part 4, the encryption


So far the files we have seen had no encryption. However, some of our users pointed out that some files are encrypted. While the encrypted KFN files were still analyzed and dumped properly, the resulting files were unreadable. Of course the player needs to support those files too, so this is something we need to take care of.

First, let me state that reverse-engineering encryption is typically a very difficult task, especially when the encryption keys are unknown and finding them requires reverse-engineering the software itself. However, as you will see below, a few major flaws in the KaraFun software make it possible to reverse-engineer even the encrypted files without dealing with the software itself.

So let’s take a look at the encryption using the files available at http://firstvietnamesechurch.com/nhc/download-karafuns. I used the Ai_biet_ngay_nao_Chua_den song in my example below, which is encrypted, but any other song could be used. Let’s dump the directory structure using the dumper program from Part 2:

Filename: UTF8:Ai Về Sông Tương.docx, type #0, len1 18139, offset 0, len2 18144, flags 1
Filename: Ai Ve Song Tuong (Thong Dat) - Chau Sa.mp3, type #2, len1 5903028, offset 18144, len2 5903040, flags 1
Filename: Ben Tre - 1.jpg, type #3, len1 77166, offset 5921184, len2 77168, flags 1
Filename: Bien Galang1 - 1993 - 1.jpg, type #3, len1 192985, offset 5998352, len2 192992, flags 1
Filename: Cau Bach Ho - Hue 1 - 1.jpg, type #3, len1 88864, offset 6191344, len2 88864, flags 1
Filename: Cheo Thuyen Tren Song Huong - 1.jpg, type #3, len1 65977, offset 6280208, len2 65984, flags 1
Filename: Chiec Xuong 3 La - 1.jpg, type #3, len1 76788, offset 6346192, len2 76800, flags 1
Filename: Cua Thuan An - 1.jpg, type #3, len1 55579, offset 6422992, len2 55584, flags 1
Filename: Dan Trau Gam Co - 1.jpg, type #3, len1 88893, offset 6478576, len2 88896, flags 1
Filename: Darling Habour Sydney - 1.jpg, type #3, len1 1229239, offset 6567472, len2 1229248, flags 1
Filename: Em & Trang Mo - 1.jpg, type #3, len1 20623, offset 7796720, len2 20624, flags 1
Filename: Em Che Quat 01 - 1.jpg, type #3, len1 16905, offset 7817344, len2 16912, flags 1
Filename: Em Nhin Nghieng - 1.jpg, type #3, len1 19615, offset 7834256, len2 19616, flags 1
Filename: Eo.S. - starburst 10 aurora distalis.milk, type #6, len1 12964, offset 7853872, len2 12976, flags 1
Filename: Galang1 - 1.jpg, type #3, len1 235096, offset 7866848, len2 235104, flags 1
Filename: Ghe Tren Song Huong - 1.jpg, type #3, len1 104282, offset 8101952, len2 104288, flags 1
Filename: Hong Duong Festival Flower in shopping center 2007 - 1.jpg, type #3, len1 227821, offset 8206240, len2 227824, flags 1
Filename: Ngan truoc Cau Darling Habour Sydney 22.5.10 - 1.jpg, type #3, len1 1666971, offset 8434064, len2 1666976, flags 1
Filename: UVNBayBuomHep_N.TTF, type #4, len1 66248, offset 10101040, len2 66256, flags 1
Filename: Ai Ve Song Tuong (Thong Dat) - Beat.mp3, type #2, len1 5901353, offset 10167296, len2 5901360, flags 1
Filename: Song.ini, type #1, len1 11722, offset 16068656, len2 11728, flags 1

You can immediately see the difference from the earlier examples: len1 and len2 differ, and the flags value is 1. All the files are scrambled. To understand how the KaraFun player knows which file is scrambled, we need to check the next song, An Nan:

Filename: end.jpg, type #3, len1 65354, offset 0, len2 65354, flags 0
Filename: begin.jpg, type #3, len1 67380, offset 65354, len2 67380, flags 0
Filename: bg 1.jpg, type #3, len1 68210, offset 132734, len2 68210, flags 0
Filename: bg 2.jpg, type #3, len1 77521, offset 200944, len2 77521, flags 0
Filename: bg 4.jpg, type #3, len1 57592, offset 278465, len2 57592, flags 0
Filename: bg 5.jpg, type #3, len1 82363, offset 336057, len2 82363, flags 0
Filename: bg 6.jpg, type #3, len1 50473, offset 418420, len2 50473, flags 0
Filename: bg 7.jpg, type #3, len1 44052, offset 468893, len2 44052, flags 0
Filename: bg 8.jpg, type #3, len1 69982, offset 512945, len2 69982, flags 0
Filename: An Nan (100 0) - ST - melody - 64 t.mp3, type #2, len1 2574172, offset 582927, len2 2574172, flags 0
Filename: An Nan (100 0) - ST - 128 t.mp3, type #2, len1 5062944, offset 3157099, len2 5062944, flags 0
Filename: y2kboogie.ttf, type #4, len1 43124, offset 8220043, len2 43124, flags 0
Filename: Song.ini, type #1, len1 9524, offset 8263167, len2 9536, flags 1

After dumping the second song we check the files and see that all of them are unscrambled except Song.ini, which is encrypted. It is clear now that the flags value indicates whether the file is scrambled or not.

Now we need to figure out the following things:

  • Whether the file is packed, encrypted or both packed and encrypted;
  • If packed, which packer is being used;
  • If encrypted, which encryption is being used and with which key.

For the primary analysis we look at the following lines from both outputs:

Filename: Song.ini, type #1, len1 11722, offset 16068656, len2 11728, flags 1
Filename: Song.ini, type #1, len1 9524, offset 8263167, len2 9536, flags 1

Song.ini is a text file which, if packed, should become much smaller as a result. The same goes for the TTF font file, which typically packs very well. However, the size change is so minimal that it is safe to assume for now that the files are not packed, just encrypted. Therefore we need to figure out the encryption scheme and the key. The encryption scheme could be anything from XORing the content with a single byte (which could be guessed just by looking at the file) to a complex symmetric cipher using an embedded key (which would require reverse-engineering the binary).
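To illustrate the "packed text should be much smaller" argument, here is a minimal Java sketch. It assumes a zlib-style packer (java.util.zip.Deflater) purely for demonstration, since nothing in the KFN files suggests which packer would be used, and the INI-like text is made up rather than taken from a real Song.ini. The point is only that text normally shrinks a lot when packed, while the two Song.ini files above grew by 6 and 12 bytes.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.Deflater;

class PackCheck
{
	// Builds a hypothetical INI-like text; this is NOT the real Song.ini content
	static byte[] sampleIni()
	{
		StringBuilder text = new StringBuilder( "[General]\n" );

		for ( int i = 0; i < 200; i++ )
			text.append( "Text" ).append( i ).append( "=La la la la la la\n" );

		return text.toString().getBytes( StandardCharsets.UTF_8 );
	}

	// Returns the size of the data after deflate compression
	static int deflatedSize( byte[] data )
	{
		Deflater deflater = new Deflater();
		deflater.setInput( data );
		deflater.finish();

		byte[] buffer = new byte[4096];
		int total = 0;

		// Keep draining the compressor until all input is consumed
		while ( !deflater.finished() )
			total += deflater.deflate( buffer );

		deflater.end();
		return total;
	}

	public static void main( String[] args )
	{
		byte[] plain = sampleIni();
		System.out.println( "Original: " + plain.length + " bytes, deflated: " + deflatedSize( plain ) + " bytes" );
	}
}
```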

To start, we take a look at the first 16 bytes of the JPEG files. We choose the JPEG files since they share the same signature in the first 4-8 bytes, which typically looks like this:

00000000 ff d8 ff e0 00 10 4a 46 49 46 00 01 01 01 00 48 |......JFIF.....H|

So we run the hexdump on all the jpg files in the directory of the unpacked Ai_biet_ngay_nao_Chua_den:

> for i in *.jpg; do hexdump -C "$i" | head -n 1; done
00000000 bd e5 91 9a cb 27 72 46 b1 fc b1 8a a9 c8 0c c5 |.....'rF........|
00000000 bd e5 91 9a cb 27 72 46 b1 fc b1 8a a9 c8 0c c5 |.....'rF........|
00000000 bd e5 91 9a cb 27 72 46 b1 fc b1 8a a9 c8 0c c5 |.....'rF........|
00000000 bd e5 91 9a cb 27 72 46 b1 fc b1 8a a9 c8 0c c5 |.....'rF........|
00000000 bd e5 91 9a cb 27 72 46 b1 fc b1 8a a9 c8 0c c5 |.....'rF........|
00000000 bd e5 91 9a cb 27 72 46 b1 fc b1 8a a9 c8 0c c5 |.....'rF........|
00000000 bd e5 91 9a cb 27 72 46 b1 fc b1 8a a9 c8 0c c5 |.....'rF........|
00000000 bd e5 91 9a cb 27 72 46 b1 fc b1 8a a9 c8 0c c5 |.....'rF........|
00000000 bd e5 91 9a cb 27 72 46 b1 fc b1 8a a9 c8 0c c5 |.....'rF........|
00000000 bd e5 91 9a cb 27 72 46 b1 fc b1 8a a9 c8 0c c5 |.....'rF........|
00000000 bd e5 91 9a cb 27 72 46 b1 fc b1 8a a9 c8 0c c5 |.....'rF........|
00000000 bd e5 91 9a cb 27 72 46 b1 fc b1 8a a9 c8 0c c5 |.....'rF........|
00000000 bd e5 91 9a cb 27 72 46 b1 fc b1 8a a9 c8 0c c5 |.....'rF........|
00000000 bd e5 91 9a cb 27 72 46 b1 fc b1 8a a9 c8 0c c5 |.....'rF........|
00000000 bd e5 91 9a cb 27 72 46 b1 fc b1 8a a9 c8 0c c5 |.....'rF........|

What can we see here?

First, the encryption used is NOT a simple single-byte XOR, since the 0xFF at offset 0 is encoded as 0xBD while the 0xFF at offset 2 is encoded as 0x91. It may be an XOR with a static table using the offset as salt, or a symmetric cipher.

Second, the encryption key is the same for every encrypted file, and no file-specific salt is used, since all the JPEG files start with the same signature.
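The first observation can be checked with a few lines of Java. This is only a sketch, and it assumes the original JPEGs start with the typical JFIF signature shown earlier (ff d8 ff e0), which we cannot verify without the unencrypted images. If a single-byte XOR were used, every position would give the same key byte.

```java
class XorCheck
{
	// The key byte a plain single-byte XOR would need to turn 'plain' into 'encrypted'
	static int xorKeyByte( int plain, int encrypted )
	{
		return ( plain ^ encrypted ) & 0xFF;
	}

	public static void main( String[] args )
	{
		// Assumed JFIF signature, and the first four bytes of the encrypted JPEGs from the hexdump
		int[] plain = { 0xFF, 0xD8, 0xFF, 0xE0 };
		int[] encrypted = { 0xBD, 0xE5, 0x91, 0x9A };

		for ( int i = 0; i < plain.length; i++ )
			System.out.printf( "offset %d: key byte would be 0x%02X%n", i, xorKeyByte( plain[i], encrypted[i] ) );
	}
}
```

The two 0xFF bytes at offsets 0 and 2 would need different key bytes (0x42 and 0x6E), which rules out XOR with one constant byte.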

Now let’s take a look at the font file UVNBayBuomHep_N.TTF (I only show the relevant part):

> hexdump -C UVNBayBuomHep_N.TTF | head -n 80 | tail
00000490 0e fb b9 60 33 06 27 dc d9 0e f8 23 ae 23 80 f2 |...`3.'....#.#..|
*
000004b0 f1 52 7f 51 1b 15 3f 02 92 93 6f 11 5c 67 ac 4a |.R.Q..?...o.\g.J|
000004c0 0e fb b9 60 33 06 27 dc d9 0e f8 23 ae 23 80 f2 |...`3.'....#.#..|
*
000004e0 37 c9 fb 30 b8 76 ac 9b dd 40 70 4b d7 01 cf 41 |7..0.v...@pK...A|
000004f0 0e fb b9 60 33 06 27 dc d9 0e f8 23 ae 23 80 f2 |...`3.'....#.#..|
*
000006a0 08 e0 f2 d9 0f a7 7f 82 31 6e 6a 42 5d 5b 08 7f |........1njB][..|
000006b0 76 74 de 72 62 b1 20 8d fc 8c a7 c0 0d fe 01 56 |vt.rb. ........V|

As you see, the same pattern repeats between 0x4F0 and 0x6A0. From my past experience I'd speculate those areas are filled with zeros, but it would be great to see what's there in the original file. Could we?

In fact, we can. This TTF font file has a unique file name, and it is doubtful the church actually created it. This means it must be available somewhere, and indeed the first Google search found the file, so we can dump the same section and see:

> hexdump -C ~/download/UVNBayBuomHep_R.TTF | head -n 80 | tail
00000490 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
000004b0 00 00 00 c6 00 00 00 00 00 00 00 00 00 c7 00 c8 |................|
000004c0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
000004e0 00 00 00 00 00 00 00 c9 00 00 00 00 00 00 00 00 |................|
000004f0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
000006a0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ca |................|
000006b0 00 cb 00 00 00 cc 00 00 00 00 00 00 00 00 00 00 |................|

Zeros as expected, but there is much more information here. You can see that all the 16-byte blocks containing all zeros are encoded into the same patterns, while the blocks having a single nonzero value such as the one at 0x4E0 are encoded in a completely different pattern. However the following block of zeros is encoded again the same way as before. This allows us to make a few very important conclusions:

  • The encryption used is a symmetric block cipher such as Blowfish, AES, DES, RC2, Camellia etc.
  • The encryption uses 16-byte (128-bit) blocks. We can further confirm it by looking at the difference between len1 and len2. For block ciphers to work properly the file content must be padded to match the block size. As we can see, all the len2 values are divisible by 16 while most len1 values aren't (when len1 already is, as for the 88864-byte JPEG, len2 equals len1). Therefore we can assume that for encrypted files we need to read len2 bytes for decrypting, but the file size on disk should be len1, with the extra padding bytes thrown out.
  • The encryption is used in the ECB mode (http://en.wikipedia.org/wiki/Block_cipher_modes_of_operation), where each block is encoded separately. Hence the content of one block has no effect on the other blocks, and all 16-byte blocks with the same content will be encoded in the same way. This is beneficial for in-file storage formats because it allows random access to the content, but it is generally a weak mode of encryption. This was a fatal flaw which allowed reversing the encryption easily.
  • Combining the 128-bit block size with the ECB mode narrows down the candidates. Blowfish, DES and RC2 use 64-bit blocks, which leaves AES and Camellia from the list above, and my bet is on AES.
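The block-size and ECB conclusions above can be checked with a small Java sketch. It uses AES only because that is the guess at this point, and the key is an arbitrary demo value, not the KaraFun key. The first part checks that rounding len1 up to a multiple of 16 gives len2 for a few entries from the dumps. The second part shows that in ECB mode two identical plaintext blocks produce identical ciphertext blocks, while a block with a single nonzero byte (like the one at 0x4E0) comes out completely different.

```java
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.spec.SecretKeySpec;

class EcbCheck
{
	// Rounds a length up to the next multiple of the 16-byte block size
	static int paddedLength( int length )
	{
		return ( length + 15 ) / 16 * 16;
	}

	// Encrypts with AES-128 in ECB mode without padding; data length must be a multiple of 16
	static byte[] encryptEcb( byte[] key, byte[] data ) throws Exception
	{
		Cipher cipher = Cipher.getInstance( "AES/ECB/NoPadding" );
		cipher.init( Cipher.ENCRYPT_MODE, new SecretKeySpec( key, "AES" ) );
		return cipher.doFinal( data );
	}

	public static void main( String[] args ) throws Exception
	{
		// len1/len2 pairs copied from the directory dumps above
		int[][] lengths = { { 18139, 18144 }, { 88864, 88864 }, { 11722, 11728 }, { 9524, 9536 } };

		for ( int[] pair : lengths )
			System.out.println( pair[0] + " -> " + paddedLength( pair[0] )
				+ ", matches len2: " + ( paddedLength( pair[0] ) == pair[1] ) );

		// Arbitrary demo key, NOT the KaraFun key
		byte[] key = new byte[16];
		Arrays.fill( key, (byte) 0x42 );

		// Three blocks: zeros, zeros with one nonzero byte, zeros again
		byte[] plain = new byte[48];
		plain[16 + 7] = (byte) 0xC9;

		byte[] enc = encryptEcb( key, plain );
		byte[] b0 = Arrays.copyOfRange( enc, 0, 16 );
		byte[] b1 = Arrays.copyOfRange( enc, 16, 32 );
		byte[] b2 = Arrays.copyOfRange( enc, 32, 48 );

		System.out.println( "block 0 == block 2: " + Arrays.equals( b0, b2 ) );
		System.out.println( "block 0 == block 1: " + Arrays.equals( b0, b1 ) );
	}
}
```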

Now we need to find whether the key is stored in the KFN file or in the KaraFun executable. Hopefully it is stored in the file, and we would find out how it is stored and where. Otherwise, if it is stored in the executable itself (meaning all the KFN files are encrypted with the same key) it would be more difficult.

So for simplicity's sake we make the following assumptions: the file is encrypted with the AES algorithm in ECB mode using 128-bit blocks and a 128-bit key, and the key is stored somewhere in the file.

First we find out which bytes we need to match with the key:

> hexdump -n 8 -C ~/download/UVNBayBuomHep_R.TTF > out.match

Once decrypted properly, the first 8 bytes from the font file should match those bytes. To speed up the decryption, let's copy the first 64 bytes from the encrypted font file:

> dd if=UVNBayBuomHep_N.TTF of=encrypted.dat bs=1 count=64
64+0 records in
64+0 records out
64 bytes (64 B) copied, 0.000496502 s, 129 kB/s

Now we can create a script which tries to find the key:

# 16081787 is the size of the KFN file
i=0
while [ $i -lt 16081787 ]; do
 # Generate the key by hexdumping the 16 bytes of the file at offset $i
 key=$(hexdump -s $i -n 16 -e '1/1 "%02X"' ai.kfn)
 # Try to decrypt with OpenSSL using AES-128-ECB. Ignore OpenSSL decrypt errors.
 openssl enc -in encrypted.dat -out test.dat -d -aes-128-ecb -K "$key" > /dev/null 2>&1
 # Hexdump the first 8 bytes of the result and compare it with what's expected
 hexdump -C -n 8 test.dat > out.hex
 diff=$(diff out.hex out.match)
 if [ -z "$diff" ]; then
   echo "Key $key found at offset $i"
   exit
 fi
 # POSIX arithmetic expansion, so the script also works when run with plain sh
 i=$((i + 1))
done

As you see, the script simply tries to use the 16 bytes from the file as the AES key, starting first from the offset 0, then 1 and so on. Brute-force in action. Let’s run it:

> sh bruteforce.sh
Key 7D64DEA5E1BE5DD4FC7ED23F78C8D8DF found at offset 76

So basically all our assumptions were true: the key is inside the file, and the encryption is AES in ECB mode with 128-bit blocks. Let’s try it with some other file, such as Song.ini:

openssl enc -in Song.ini -out Song.out -d -aes-128-ecb \
    -K 7D64DEA5E1BE5DD4FC7ED23F78C8D8DF

and check Song.out, which looks fine. So we can conclude the key is correct and will work on all the files.

The only remaining question is whether the key is always stored at offset 76 (which is 0x4C). To see it we hexdump the file around this offset:
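For those who prefer to stay in Java, the same brute-force idea can be sketched as follows. This is not the script I actually ran, just an in-memory equivalent under the same assumptions (AES-128-ECB, key stored somewhere in the file). The class and method names are made up, and main() uses a small synthetic buffer with a made-up key instead of a real KFN file.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.spec.SecretKeySpec;

class KeySearch
{
	// Encrypts one or more 16-byte blocks with AES-128-ECB; only needed to build the demo data
	static byte[] encrypt( byte[] key, byte[] data ) throws Exception
	{
		Cipher cipher = Cipher.getInstance( "AES/ECB/NoPadding" );
		cipher.init( Cipher.ENCRYPT_MODE, new SecretKeySpec( key, "AES" ) );
		return cipher.doFinal( data );
	}

	// Tries every 16-byte window of the container as the key. Returns the offset of the first
	// window that decrypts encryptedBlock into data starting with 'expected', or -1 if none does.
	static int findKeyOffset( byte[] container, byte[] encryptedBlock, byte[] expected ) throws Exception
	{
		Cipher cipher = Cipher.getInstance( "AES/ECB/NoPadding" );

		for ( int offset = 0; offset + 16 <= container.length; offset++ )
		{
			cipher.init( Cipher.DECRYPT_MODE, new SecretKeySpec( container, offset, 16, "AES" ) );
			byte[] plain = cipher.doFinal( encryptedBlock );

			if ( Arrays.equals( Arrays.copyOf( plain, expected.length ), expected ) )
				return offset;
		}

		return -1;
	}

	public static void main( String[] args ) throws Exception
	{
		// Synthetic stand-in for a KFN file: zero filler with a made-up key placed at offset 76
		byte[] key = "0123456789abcdef".getBytes( StandardCharsets.US_ASCII );
		byte[] container = new byte[200];
		System.arraycopy( key, 0, container, 76, 16 );

		// A made-up 16-byte plaintext block; its first 8 bytes play the role of out.match
		byte[] block = "known plaintext!".getBytes( StandardCharsets.US_ASCII );
		byte[] encrypted = encrypt( key, block );

		System.out.println( "Key found at offset " + findKeyOffset( container, encrypted, Arrays.copyOf( block, 8 ) ) );
	}
}
```

With a real file, container would hold the bytes of the KFN file, encryptedBlock the first 16 bytes of the encrypted font, and expected the first 8 bytes of the original font.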

> hexdump -C ai.kfn | head
00000000 4b 46 4e 42 44 49 46 4d 01 03 00 00 00 44 49 46 |KFNBDIFM.....DIF|
00000010 57 01 03 00 00 00 47 4e 52 45 01 ff ff ff ff 53 |W.....GNRE.....S|
00000020 46 54 56 01 53 00 12 01 4d 55 53 4c 01 71 01 00 |FTV.S...MUSL.q..|
00000030 00 41 4e 4d 45 01 0d 00 00 00 54 59 50 45 01 00 |.ANME.....TYPE..|
00000040 00 00 00 46 4c 49 44 02 10 00 00 00 7d 64 de a5 |...FLID.....}d..|
00000050 e1 be 5d d4 fc 7e d2 3f 78 c8 d8 df 54 49 54 4c |..]..~.?x...TITL|
00000060 02 1f 00 00 00 41 69 20 56 e1 bb 81 20 53 c3 b4 |.....Ai V... S..|
00000070 6e 67 20 54 c6 b0 c6 a1 6e 67 20 28 43 6c 61 73 |ng T....ng (Clas|
00000080 73 69 63 29 41 52 54 53 02 08 00 00 00 43 68 c3 |sic)ARTS.....Ch.|
00000090 a2 75 20 53 61 41 4c 42 4d 02 0d 00 00 00 4b 66 |.u SaALBM.....Kf|

Here is our key, starting from 0x4C. It is all clear now – the key is stored in the file header under the FLID. So the loader logic should be the following:

  • Parse FLID and get the key (the non-encrypted files tend to have zeros there);
  • If the file flags are nonzero, read the padded length of the file and decrypt it with AES-128-ECB using the key from FLID;
  • Write the decrypted data to disk up to the real length of the file, and discard the remainder.

Knowing that, we can modify our Java unpacker to add the encryption support:

import java.io.File;
import java.io.IOException;
import java.io.FileOutputStream;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.List;
import javax.crypto.Cipher;
import javax.crypto.spec.SecretKeySpec;
import java.security.GeneralSecurityException;

class KFNDumper
{
	public static final int TYPE_SONGTEXT = 1;
	public static final int TYPE_MUSIC = 2;
	public static final int TYPE_IMAGE = 3;
	public static final int TYPE_FONT = 4;
	public static final int TYPE_VIDEO = 5;

	// The KFN file
	private RandomAccessFile m_file = null;

	// The file decryptor, if known
	private Cipher m_decryptor = null;

	// A directory entry
	class Entry
	{
		public int type;		// the file type; see TYPE_
		public String filename;	// the original file name in the original encoding
		public int length_in;	// the file length in the KFN file (padded to 16 bytes if encrypted)
		public int length_out;	// the file length on disk; if the file is encrypted it is the same or smaller than length_in
		public int offset;		// the file offset in the KFN file starting from the directory end
		public int flags;		// the file flags; 0 means "not encrypted", 1 means "encrypted"
	};

	public KFNDumper( String kfnFilename ) throws IOException
	{
		m_file = new RandomAccessFile( kfnFilename, "r" );
	}

	public List<Entry> list() throws IOException, GeneralSecurityException
	{
		List<Entry> files = new ArrayList<Entry> ();

		// Read the file signature
		String signature = new String( readBytes(4) );

		if ( !signature.equals("KFNB") )
			return new ArrayList<Entry> ();

		// Parse the header fields
		while ( true )
		{
			signature = new String( readBytes(4) );
			int type = readByte();
			int len_or_value = readDword();
			byte[] buf = null;

			switch ( type )
			{
				case 1:
					break;

				case 2:
					buf = readBytes( len_or_value );
					break;
			}

			// Store the AES key if we have it
			if ( signature.equals("FLID") && buf != null )
			{
				SecretKeySpec keyspec = new SecretKeySpec( buf, "AES" );
 				m_decryptor = Cipher.getInstance("AES/ECB/NoPadding");
				m_decryptor.init( Cipher.DECRYPT_MODE, keyspec );
			}

			if ( signature.equals("ENDH") )
				break;
		}

		// Read the number of files in the directory
		int numFiles = readDword();

		// Parse the directory
		for ( int i = 0; i < numFiles; i++ )
		{
			Entry entry = new Entry();

			int filenameLen = readDword();
			byte[] filename = readBytes( filenameLen );

			// This is definitely not correct as the native encoding is used, but that's the best we can come out with
			entry.filename = Charset.forName( "UTF-8" ).decode( ByteBuffer.wrap( filename ) ).toString();

			entry.type = readDword();
			entry.length_out = readDword();
			entry.offset = readDword();
			entry.length_in = readDword();
			entry.flags = readDword();

			files.add( entry );
		}

		// Since all the offsets are based on the end of directory, readjust them
		for ( int i = 0; i < files.size(); i++ )
			files.get(i).offset += m_file.getFilePointer();

		return files;
	}

	public void extract( final Entry entry, String outfilename ) throws IOException, GeneralSecurityException
	{
		// Prepare the decryptor if we have it
		if ( (entry.flags & 0x01) != 0 && m_decryptor == null )
			throw new IOException("Key is unknown");

		// Seek to the file beginning
		m_file.seek( entry.offset );

		// Create the output file
		FileOutputStream output = new FileOutputStream( outfilename );

		byte[] buffer = new byte[8192];	// size of the buffer must be a multiple of 16
		int total = 0;

		while ( total < entry.length_in )
		{
			int toRead = buffer.length;

			if ( toRead > entry.length_in - total )
				toRead = entry.length_in - total;

			// readFully() either fills the requested range or throws EOFException,
			// so a short read cannot split a 16-byte cipher block or loop forever at EOF
			m_file.readFully( buffer, 0, toRead );

			if ( (entry.flags & 0x01) != 0 )
			{
				// Decrypt only the bytes we just read, not the whole buffer
				byte [] decrypted = m_decryptor.doFinal( buffer, 0, toRead );

				// We might need to write less than we read since the file is rounded to 16 bytes
				int toWrite = toRead;

				if ( total + toWrite > entry.length_out )
					toWrite = entry.length_out - total;

				output.write( decrypted, 0, toWrite );
			}
			else
				output.write( buffer, 0, toRead );

			total += toRead;
		}

		output.close();
	}

	// Helper I/O functions
	private int readByte() throws IOException
	{
		return m_file.read() & 0xFF;
	}

	private int readDword() throws IOException
	{
		int b1 = readByte();
		int b2 = readByte();
		int b3 = readByte();
		int b4 = readByte();

		return b4 << 24 | b3 << 16 | b2 << 8 | b1;
	}

	private byte [] readBytes( int length ) throws IOException
	{
		byte [] array = new byte [ length ];

		if ( m_file.read( array ) != length )
			throw new IOException();

		return array;
	}

    public static void main( String [] args ) throws Exception
    {
		if ( args.length == 0 )
		{
			System.out.println( "Usage: app <KFN file>\n" );
			return;
		}

		KFNDumper kfnfile = new KFNDumper( args[0] );
		List<Entry> entries = kfnfile.list();

		for ( Entry entry : entries )
		{
			System.out.println( "File " + entry.filename + ", type: " + entry.type 
                                     + ", length_in: " + entry.length_in + ", length_out: " 
                                     + entry.length_out + ", offset: " + entry.offset + ", flags: " + entry.flags  );
			kfnfile.extract( entry, entry.filename );
		}
    }
}

That’s it.

Update: one of my readers, who has purchased the paid version of KaraFun Studio, told me this program can save the KFN files in two modes, locked and unlocked. I have asked him to create a sample project with exactly the same content, save it as locked and as unlocked, and provide me with the projects. After analyzing the files, it seems the “locked” project only differs by the Song.ini file being encrypted, while in the unlocked project the Song.ini is stored as cleartext. So now we also know how those encrypted files are created.

