Hooked on Mnemonics Worked for Me

ObfStrReplacer & ExtractSubfile Snippets

ObfStrReplacer is a script that replaces obfuscated variable names with easier-to-read strings. Some obfuscation techniques rely on similar-looking strings to make the code difficult to read. For example, the string Illl1III111I11 is hard to distinguish from lIll1III111I11. ObfStrReplacer takes a regular expression as an argument to match obfuscated strings. It then adds all matches to a set and replaces each match with a unique string. For example, 11ll1III111I11 would become _carat. All renamed strings start with "_". In the image above we can see the obfuscated code on the left and the de-obfuscated code on the right.

Please see the command line example in the source code for details on usage. I have confirmed it works well on obfuscated ActionScript.  The code blindly replaces matches. It does not check for the reuse of variable names within the scope of different functions. I plan on adding this at a later date. Please leave a VT hash in the comments if you have an example.
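
The replacement idea can be sketched in a few lines of Python 3. This is only an illustration and not the ObfStrReplacer source: the function name, the `_varN` naming scheme and the example regular expression are assumptions made for the example.

```python
import re

def replace_obfuscated(source, pattern):
    """Replace every distinct regex match with a unique, readable name."""
    # Collect the distinct obfuscated names; sort them for stable numbering
    names = sorted({m.group(0) for m in re.finditer(pattern, source)})
    # Hypothetical naming scheme: every new name starts with "_"
    mapping = {name: "_var%d" % i for i, name in enumerate(names)}
    # Blindly substitute each match, with no scope checks
    cleaned = re.sub(pattern, lambda m: mapping[m.group(0)], source)
    return cleaned, mapping

code = "var Illl1III111I11 = lIll1III111I11 + Illl1III111I11;"
cleaned, mapping = replace_obfuscated(code, r"[Il1]{8,}")
print(cleaned)
```

Like the script described above, the sketch does not look at scope, so the same obfuscated name always maps to the same replacement.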

ObfStrReplacer Source Code

ExtractSubfile is a simple modification to hachoir-subfile's search.py. It is used to extract embedded files. The carving functionality was already included in hachoir-subfile but not exposed.


__@___:~/$ hachoir-subfile crsenvironscan.xls 
[+] Start search on 126444 bytes (123.5 KB)

[+] File at 0 size=80384 (78.5 KB): Microsoft Office document
[+] File at 2584 size=52039 (50.8 KB): Macromedia Flash data: version 9

[+] End of search -- offset=126444 (123.5 KB)
Total time: 1 sec 478 ms -- global rate: 83.5 KB/sec
__@___:~/$ python ExtractSubFile.py  crsenvironscan.xls 
[+] Start search on 126444 bytes (123.5 KB)

[+] File at 0 size=80384 (78.5 KB): Microsoft Office document => /home/file-0001.doc
[+] File at 2584 size=52039 (50.8 KB): Macromedia Flash data: version 9 => /home/file-0002.swf

[+] End of search -- offset=126444 (123.5 KB)


In the output of the second command, the two "File at" lines show that a document and a SWF were carved.
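
Carving itself is just slicing the container at the reported offset and size. A minimal Python sketch follows. It is not the hachoir-subfile code, and the function name and sample bytes are made up for the example.

```python
def carve(data, offset, size):
    """Return the embedded file that starts at `offset` and is `size` bytes long."""
    if offset < 0 or size < 0 or offset + size > len(data):
        raise ValueError("carve range falls outside the container")
    return data[offset:offset + size]

# Stand-in container: 6 bytes of header, an 11-byte "file", then a trailer
container = b"HEADER" + b"FWS\x09payload" + b"TRAILER"
embedded = carve(container, 6, 11)
print(embedded)
```

With the values from the output above, the SWF would be `carve(data, 2584, 52039)`.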

ExtractSubFile Source Code

Base91 & Angler SWFs

If anyone is curious, the encoding that Angler is using in its SWFs is base91. The encoding was hinted at in an excellent article by Palo Alto Networks, but there it was only identified as a function named DecodeToByteArray. Below are my notes on decoding and decompressing the embedded SWF. 

___*____$ swfextract c34266299460225c0354df5438417924579641095ffd7588a42d8fae07ae8511 
Objects in file c34266299460225c0354df5438417924579641095ffd7588a42d8fae07ae8511:
 [-i] 1 MovieClip: ID(s) 4
 [-F] 1 Font: ID(s) 1
 [-b] 1 Binary: ID(s) 5
 [-f] 1 Frame: ID(s) 0

___*____$ swfextract c34266299460225c0354df5438417924579641095ffd7588a42d8fae07ae8511 -b 5
___*____$ ls
c34266299460225c0354df5438417924579641095ffd7588a42d8fae07ae8511  output.bin  xxxswf.py
 
___*____$ hexdump -C output.bin | head

00000000  40 5a 7a 55 7b 5a 78 30  46 3b 49 26 52 48 43 5d  |@ZzU{Zx0F;I&RHC]|
00000010  40 62 66 40 40 6d 32 7b  59 25 52 5d 75 75 62 55  |@bf@@m2{Y%R]uubU|
00000020  59 4d 53 61 30 34 2a 76  7b 5e 21 74 39 5a 5b 7d  |YMSa04*v{^!t9Z[}|
00000030  62 3f 38 42 3d 5f 51 6b  24 5b 23 3a 50 2c 2c 5e  |b?8B=_Qk$[#:P,,^|
00000040  22 7b 6e 6b 23 69 21 48  2b 35 54 60 24 22 2e 36  |"{nk#i!H+5T`$".6|
00000050  58 6c 75 6d 6d 4c 54 67  48 28 5a 6a 44 4b 30 63  |XlummLTgH(ZjDK0c|
00000060  37 2a 23 3f 53 78 6c 57  4a 67 68 60 48 45 76 67  |7*#?SxlWJgh`HEvg|
00000070  35 2e 79 4a 35 3c 46 6c  5b 47 46 3f 79 42 30 47  |5.yJ5<Fl[GF?yB0G|
00000080  35 6d 3c 67 2c 54 7b 59  42 2b 6a 4f 50 2b 3b 65  |5m<g,T{YB+jOP+;e|
00000090  79 26 26 3c 30 7c 65 59  7a 59 5e 57 22 4b 72 4b  |y&&<0|eYzY^W"KrK|

While reviewing the data I noticed that all of the bytes were valid ASCII. This usually implies base64, but characters such as '@' and '$' meant it had to be a modified version of it. A mistake I made after deobfuscating the ActionScript was that I only gave the decoder a cursory look. The code and data had the patterns of base64 and I blindly assumed that is what it was. If it was a modified version of base64, I could reconstruct all the chars of the table by reading each character from the data into a set. From there I would need to find the right sequence of chars. Strangely, this hackish approach led me to the encoding.

In [1]: f = open("output.bin", "rb")

In [2]: d = f.read()

In [3]: o = set([])

In [4]: for x in d:
            o.add(x)

In [5]: "".join(sorted(o))
Out[5]: '!"#$%&()*+,./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[]^_`abcdefghijklmnopqrstuvwxyz{|}~'

In [6]: len(o)
Out[6]: 91

91?  As in Base91? Weird, never heard of that. A search led me to some code written by Adrien Beraud, which confirmed the ActionScript is indeed Base91.
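
The same observation can be turned into a quick check. The Python 3 sketch below compares a character set against the Base91 alphabet used by Adrien Beraud's implementation. The helper name is hypothetical, and `observed` is the sorted set printed in the session above.

```python
BASE91 = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789"
    "!#$%&()*+,./:;<=>?@[]^_`{|}~\""
)

def outside_base91(data):
    """Return the characters in `data` (bytes) that are not Base91 symbols."""
    return set(data.decode("latin-1")) - BASE91

# Sorted character set taken from the IPython output above
observed = ('!"#$%&()*+,./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ'
            '[]^_`abcdefghijklmnopqrstuvwxyz{|}~')
# First two rows of the hexdump of output.bin
sample = b"@ZzU{Zx0F;I&RHC]@bf@@m2{Y%R]uubU"

print(set(observed) == BASE91, outside_base91(sample))
```

The three printable ASCII symbols missing from the alphabet are the single quote, the dash and the backslash.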

After the data is decoded with base91, each byte is XORed. Initially the key is set to a hard-coded value; after that, the key for each byte is the previous encoded byte (its value before the XOR). Once the XOR loop is completed, the data is decompressed with zlib. The initial XOR key is not the same in every sample: in PAN's write-up the key is 91 and in mine it was 75. The key can be found with a decompiler (Trillix or JPEXS) or a disassembler (swfdump). The latter makes it possible to extract the XOR key from the command line. swfdump -a can be used to get the assembly of the ActionScript. Searching for bitxor and pushint should provide the XOR key.
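
The rolling XOR can be exercised on its own. The Python 3 sketch below includes a matching encoder so the round trip can be checked. The encoder is an assumption used only for the demonstration (the sample itself only contains the decoding side), and the function names are mine.

```python
import zlib

def rolling_xor_decode(data, key):
    """XOR each byte with the previous encoded byte, starting with `key`."""
    out = bytearray()
    for byte in bytearray(data):
        out.append(byte ^ key)
        key = byte  # the encoded byte just consumed becomes the next key
    return bytes(out)

def rolling_xor_encode(data, key):
    """Inverse of rolling_xor_decode, written only to test the round trip."""
    out = bytearray()
    for byte in bytearray(data):
        key = byte ^ key  # the encoded byte is also the next key
        out.append(key)
    return bytes(out)

payload = zlib.compress(b"stand-in for the embedded SWF")
encoded = rolling_xor_encode(payload, 75)
decoded = zlib.decompress(rolling_xor_decode(encoded, 75))
print(decoded)
```

Because every key after the first comes from the data itself, a wrong initial key only corrupts the first decoded byte, which is usually still enough to make zlib reject the stream.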

___*____$ swfdump -a c34266299460225c0354df5438417924579641095ffd7588a42d8fae07ae8511 > asm.as
___*____$ vi asm.as

        00045) + 2:1 callpropvoid <q>[public]::I1lllIII111I11, 1 params
        00046) + 0:1 pushint 75   <- KEY
        00047) + 1:1 convert_u
        00048) + 1:1 setlocal r4
        00049) + 0:1 pushint 0
        00050) + 1:1 setlocal r5
        00051) + 0:1 label
        00052) + 0:1 getlocal r5
        00053) + 1:1 getlocal r3
        00054) + 2:1 getlocal_0
        00055) + 3:1 getproperty <q>[private]::1Ill1III111I11
        00056) + 3:1 getproperty <q>[public]::+ll1III111I11
        00057) + 3:1 getproperty <l,multi>{[public]""}
        00058) + 2:1 lessthan
        00059) + 1:1 iffalse ->81
        00060) + 0:1 getlocal r3
        00061) + 1:1 getlocal r5
        00062) + 2:1 getproperty <l,multi>{[public]""}
        00063) + 1:1 getlocal r4
        00064) + 2:1 bitxor       <- XOR
        00065) + 1:1 convert_u
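
The search for bitxor and pushint can also be scripted. The Python 3 sketch below scans swfdump -a style text for pushint constants that appear shortly before a bitxor. The function name and the window size are assumptions, and the listing is a trimmed copy of the lines above.

```python
import re

PUSHINT = re.compile(r"\bpushint\s+(-?\d+)")

def xor_key_candidates(listing, window=25):
    """Return pushint constants found at most `window` lines before a bitxor."""
    lines = listing.splitlines()
    candidates = []
    for i, line in enumerate(lines):
        if "bitxor" not in line:
            continue
        # Look back a few instructions for integer constants
        for prev in lines[max(0, i - window):i]:
            match = PUSHINT.search(prev)
            if match:
                candidates.append(int(match.group(1)))
    return candidates

listing = """\
00046) + 0:1 pushint 75
00047) + 1:1 convert_u
00048) + 1:1 setlocal r4
00049) + 0:1 pushint 0
00050) + 1:1 setlocal r5
00063) + 1:1 getlocal r4
00064) + 2:1 bitxor
"""
print(xor_key_candidates(listing))
```

The result still needs a human look: here the 0 is the loop counter being initialized and 75 is the key.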


Quickly written Python code for decoding and extracting the second SWF. The key will likely need to be modified.

# The Base91 code is written by Adrien Beraud
# https://github.com/aberaud/base91-python/blob/master/base91.py

# Base91 encode/decode
#
# Copyright (c) 2012 Adrien Beraud
# All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions are met:
#
#   * Redistributions of source code must retain the above copyright notice,
#     this list of conditions and the following disclaimer.
#   * Redistributions in binary form must reproduce the above copyright notice,
#     this list of conditions and the following disclaimer in the documentation
#     and/or other materials provided with the distribution.
#   * Neither the name of Adrien Beraud, Wisdom Vibes Pte. Ltd., nor the names
#     of its contributors may be used to endorse or promote products derived
#     from this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
# AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
# DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE
# FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
# DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
# SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
# CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
# OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
#  

import struct
import sys
import zlib

base91_alphabet = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M',
 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z',
 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm',
 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z',
 '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '!', '#', '$',
 '%', '&', '(', ')', '*', '+', ',', '.', '/', ':', ';', '<', '=',
 '>', '?', '@', '[', ']', '^', '_', '`', '{', '|', '}', '~', '"']

decode_table = dict((v,k) for k,v in enumerate(base91_alphabet))

def decode(encoded_str):
    ''' Decode Base91 string to a bytearray '''
    v = -1
    b = 0
    n = 0
    out = bytearray()
    for strletter in encoded_str:
        if not strletter in decode_table:
            continue
        c = decode_table[strletter]
        if(v < 0):
            v = c
        else:
            v += c*91
            b |= v << n
            n += 13 if (v & 8191)>88 else 14
            while True:
                out += struct.pack('B', b&255)
                b >>= 8
                n -= 8
                if not n>7:
                    break
            v = -1
    if v+1:
        out += struct.pack('B', (b | v << n) & 255 )
    return out

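def main():
    # Read the Base91 text that swfextract dumped from the binary tag
    with open(sys.argv[1], 'rb') as f:
        encoded = f.read().decode('latin-1')
    decoded = decode(encoded)
    out = bytearray()
    key = 75  # initial XOR key; it differs between samples
    for byte in decoded:
        out.append(byte ^ key)
        key = byte  # the next key is the previous encoded byte
    # The XORed data is a zlib stream that holds the second SWF
    swf = zlib.decompress(bytes(out))
    with open(sys.argv[1] + "-out.bin", "wb") as fh:
        fh.write(swf)

if __name__ == "__main__":
    main()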

output.bin is the binary data extracted using swfextract. The above Python code is stored in angler-decoder.py. After running the script the decoded SWF is saved to output.bin-out.bin. Then I use xxxswf.py to verify the SWF is present.


___*____$ python angler-decoder.py output.bin 
___*____$ ls
angler-decoder.py  c34266299460225c0354df5438417924579641095ffd7588a42d8fae07ae8511  output.bin  output.bin-out.bin  xxxswf.py
___*____$ python xxxswf.py output.bin-out.bin 

[SUMMARY] Potentially 1 SWF(s) in MD5 d41d8cd98f00b204e9800998ecf8427e:output.bin-out.bin
 [ADDR] SWF 1 at 0x0 - CWS Header
___*____$ python xxxswf.py -d output.bin-out.bin 

[SUMMARY] Potentially 1 SWF(s) in MD5 d41d8cd98f00b204e9800998ecf8427e:output.bin-out.bin
 [ADDR] SWF 1 at 0x0 - CWS Header
  [FILE] Carved SWF MD5: 5d4c794c3a3011da71cc31d5fd7015ce.swf
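
A quick sanity check that does not need xxxswf.py is to look at the first eight bytes of the output: a SWF starts with a three-byte signature (FWS, CWS or ZWS), a one-byte version and a four-byte little-endian length. A minimal Python 3 sketch follows, with a hypothetical function name and a made-up header for the demonstration.

```python
import struct

SIGNATURES = {b"FWS": "uncompressed", b"CWS": "zlib", b"ZWS": "LZMA"}

def swf_header(data):
    """Return (compression, version, declared_length), or None if not a SWF."""
    if len(data) < 8 or data[:3] not in SIGNATURES:
        return None
    version, length = struct.unpack("<BI", data[3:8])
    return SIGNATURES[data[:3]], version, length

# Made-up header: zlib-compressed SWF, version 9, declared length 52039
header = b"CWS" + bytes([9]) + struct.pack("<I", 52039)
print(swf_header(header))
print(swf_header(b"@ZzU{Zx0F;I&RHC]"))  # still Base91 text, so no SWF header
```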

The extracted second SWF is also obfuscated.  

My "cleaned up" ActionScript

package 
{
    import flash.display.*;
    import flash.system.*;

    public class ExtendedMovieClipFunction extends MovieClip
    {
        private var DEUNCOMPRESSED_BUFFER:Object;
        private var _CLASS_BUFFER:Class;
        private var FuncNameToStrInstance:AssignFuncNameToString;
        private var int_0:uint = 0;
        private var _uint_0:uint = 0;
        private var _uint_255:uint = 255;
        private var _object2:Object;
        private var _object3:Object;

        public function ExtendedMovieClipFunction(param1:Object = null)
        {
            this.FuncNameToStrInstance = new AssignFuncNameToString();
            #  a SWF file from other domains than that of the Loader object can call Security.allowDomain() to
            #  permit a specific domain
            Security[this.FuncNameToStrInstance.allowDomain]("*");
            var _loc_3:* = ApplicationDomain[this.FuncNameToStrInstance.currentDomain];
            var  ldr:Loader :* = _loc_3[this.FuncNameToStrInstance.getDefinition](this.FuncNameToStrInstance.flash.display.Loader) as Class;
            this.DEUNCOMPRESSED_BUFFER = new  ldr:Loader ;
            this._CLASS_BUFFER = _loc_3[this.FuncNameToStrInstance.getDefinition](this.FuncNameToStrInstance.flash.utils.ByteArray) as Class;
            
            ## The Stage class represents the main drawing area.
            if (this[this.FuncNameToStrInstance.stage])
            {
                this.FuncEventListener();
            }
            else
            {
                this[this.FuncNameToStrInstance.addEventListener](this.FuncNameToStrInstance.addedToStage, this.FuncEventListener);
            }
            return;
        }// end function

        public function 1_object1(param1:Object, param2:int) : void
        {
            param2++;
            return;
        }// end function

        private function FuncEventListener(param1:Object = null) : void
        {
            this[this.FuncNameToStrInstance.removeEventListener](this.FuncNameToStrInstance.addedToStage, this.FuncEventListener);
            this[this.FuncNameToStrInstance.addEventListener](this.FuncNameToStrInstance.enterFrame, this.I1111IIIlllIl1);
            var _loc_2:* = new ExtendedByteArrayFunction();
            var DECODE_BUFFER:* = new this._CLASS_BUFFER();
            this.CONSTRUCT_KEY();
            this.BASE91(_loc_2, _loc_2[this.FuncNameToStrInstance.length], DECODE_BUFFER);
            this.func_123(DECODE_BUFFER);
            var _loc_4:* = 75;
            var INDEX:* = 0;
            
            // XOR loop 
            if (INDEX < DECODE_BUFFER[this.FuncNameToStrInstance.length])
            {
                var _loc_6:* = DECODE_BUFFER[INDEX] ^ _loc_4;
                _loc_4 = DECODE_BUFFER[INDEX];
                DECODE_BUFFER[INDEX] = _loc_6;
                INDEX++;
                ;
            }
            # XORs the data then uncompresses 
            DECODE_BUFFER[this.FuncNameToStrInstance.uncompress]();
            this.DEUNCOMPRESSED_BUFFER[this.FuncNameToStrInstance.loadBytes](DECODE_BUFFER);
            this[this.FuncNameToStrInstance.addChild](this.DEUNCOMPRESSED_BUFFER);
            ;
            var _loc_8:* = null;
            return;
            ;
            return;
        }// end function

        private function I1111IIIlllIl1(param1) : void
        {
            if (this.currentFrame == 200)
            {
                this.I1ll1III111I11(new Number(2));
                return;
            }
            return;
        }// end function

        # Create Key  
        private function CONSTRUCT_KEY() : void
        {
            this._object2 = new this._CLASS_BUFFER();
            this._object3 = new this._CLASS_BUFFER();
            var _loc_2:* = 0;
            _loc_2 = 65;
            
            if (_loc_2 < 91)
            {
                this._object3[this.FuncNameToStrInstance.writeByte](_loc_2);
                _loc_2++;
                ;
            }
            _loc_2 = 97;
            
            if (_loc_2 < 123)
            {
                this._object3[this.FuncNameToStrInstance.writeByte](_loc_2);
                _loc_2++;
                ;
            }
            _loc_2 = 48;
            
            if (_loc_2 < 58)
            {
                this._object3[this.FuncNameToStrInstance.writeByte](_loc_2);
                _loc_2++;
                ;
            }
            _loc_2 = 33;
            
            if (_loc_2 < 48)
            {
                
                
                if (_loc_2 == 34 || _loc_2 == 39 || _loc_2 == 45)
                {
                }
                else
                {
                    this._object3[this.FuncNameToStrInstance.writeByte](_loc_2);
                }
                _loc_2++;
                ;
            }
            _loc_2 = 58;
            
            if (_loc_2 < 65)
            {
                this._object3[this.FuncNameToStrInstance.writeByte](_loc_2);
                _loc_2++;
                ;
            }
            _loc_2 = 91;
            
            if (_loc_2 < 97)
            {
                if (_loc_2 == 92)
                {
                }
                else
                {
                    this._object3[this.FuncNameToStrInstance.writeByte](_loc_2);
                }
                _loc_2++;
                ;
            }
            _loc_2 = 123;
            
            if (_loc_2 < 127)
            {
                this._object3[this.FuncNameToStrInstance.writeByte](_loc_2);
                _loc_2++;
                ;
            }
            this._object3[this.FuncNameToStrInstance.writeByte](34);
            
            var _loc_3:* = 0;
            _loc_3 = 0;
            
            if (_loc_3 < 255)
            {
                this._object2[_loc_3] = 255;
                _loc_3++;
                ;
            }
            _loc_3 = 0;
            
            if (_loc_3 < this._object3[this.FuncNameToStrInstance.length])
            {
                this._object2[this._object3[_loc_3]] = _loc_3;
                _loc_3++;
                ;
            }
            return;
        }// end function

        public function func_123(param1) : uint
        {
            var _loc_2:* = 0;
            if (this._uint_255 != 255)
            {
                param1[param1[this.FuncNameToStrInstance.length]] = this.int_0 | this._uint_255 << this._uint_0;
                _loc_2 = _loc_2 + 1;
            }
            return _loc_2;
            return;
        }// end function

        public function BASE91(param1, _length:uint, param3) : uint
        {
            var _loc_4:* = 0;
            var _loc_5:* = 0;
            var _int_8191:* = 8191;
            _INDEX = 0;
            
            # previously IF
            while (_INDEX < _length)
            {
                if (this._object2[param1[_INDEX]] == 255)
                {
                }
                else
                {
                    if (this._uint_255 == 255)
                    {
                        this._uint_255 = this._object2[param1[_INDEX]];
                    }
                    else
                    {
                        #   _uint_255 =  _uint_255 + * len(_object3)
                        this._uint_255 = this._uint_255 + this._object2[param1[_INDEX]] * this._object3[this.FuncNameToStrInstance.length];
                        this.int_0 = this.int_0 | this._uint_255 << this._uint_0;
                        
                        this._uint_0 = this._uint_0 + ((this._uint_255 & _int_8191) > 88 ? (13) : (// label, 14));
                        
                        # increment _loc_5 (_loc_8 keeps the old value)
                        var _loc_8:* = _loc_5;
                        _loc_5 = _loc_5 + 1;
                        
                        # move to out buffer
                        param3[_loc_8] = this.int_0 & 255;
                        
                        this.int_0 = this.int_0 >> 8;
                        this._uint_0 = this._uint_0 - 8;
                        if (this._uint_0 > 7) goto 160;
                        this._uint_255 = 255;
                    }
                }
                _INDEX++;
                ;
            }
            return _loc_5;
            return;
        }// end function

    }
}
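
As a cross-check of the cleaned-up CONSTRUCT_KEY function, the Python 3 sketch below rebuilds the table from the same character ranges and compares it with the alphabet from Adrien Beraud's base91.py. The function name is mine, and the lookup table uses 256 slots for simplicity where the decompiled loop fills 255.

```python
STANDARD_BASE91 = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789"
    "!#$%&()*+,./:;<=>?@[]^_`{|}~\""
)

def construct_key():
    """Mirror CONSTRUCT_KEY: return (alphabet, reverse lookup table)."""
    skipped = {34, 39, 45, 92}  # double quote, single quote, dash, backslash
    ranges = [(65, 91), (97, 123), (48, 58), (33, 48),
              (58, 65), (91, 97), (123, 127)]
    codes = [c for lo, hi in ranges for c in range(lo, hi) if c not in skipped]
    codes.append(34)  # the double quote is written last
    lookup = [255] * 256  # 255 marks "not a Base91 character"
    for index, code in enumerate(codes):
        lookup[code] = index
    return "".join(map(chr, codes)), lookup

alphabet, lookup = construct_key()
print(alphabet == STANDARD_BASE91)
```

The ranges produce the standard alphabet in the standard order, so no custom table is needed to decode the data.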


Exploring the Top 100 ebooks of The Pirate Bay


I wrapped up an analysis of the Top 100 ebooks of the Pirate Bay.  Rather than posting the code here I decided to use a notebook viewer. All the data and code can be found in my Bitbucket repo. Cheers.

The Beginner's Guide to IDAPython

In my spare time for the past couple of months I have been working on an ebook called "The Beginner's Guide to IDAPython". I originally wrote it as a reference for myself. I wanted a place to go to where I could find examples of functions that I commonly use (and forget) in IDAPython. Since I started the book I have used it many times as a quick reference to understand syntax or see an example of some code. I hope others will find it equally useful. The book is not a static document. I already have a list of content and topics that I would like to add, like a cover. Please feel free to email me if you would like a topic added, have a correction or would like to say hi. My email is at the bottom of the introduction. The ebook can be found at the link below.


https://leanpub.com/IDAPython-Book

The price is free (move the slider to the left) but it has a suggested price of $14.99. In all honesty I don't care if you purchase it. A purchase would be nice but I'd rather you learn something from it.

Logistics
I wrote the book in Markdown. I used StackEdit [1] as an editor. I paid for a sponsor account, which allowed me to download the book as a PDF. I did version control and hosting via Bitbucket [2]. Not sure why Dan [3] and I are the only people on it. Bitbucket is awesome. Unlimited free private repos for the win. I'm using leanpub [4] as the distributor for my ebook. Ange Albertini is also using leanpub to publish an ebook called Binary is beautiful [5]. Disclaimer: his book will be way better than mine.

I'd like to thank Hexacorn for all his feedback and support.

1. https://stackedit.io/
2. https://bitbucket.org/
3. https://bitbucket.org/daniel_plohmann
4. http://leanpub.com/
5. https://leanpub.com/binaryisbeautiful

Updates:  Usual grammar issues...


Dyre IE Hooks

I recently wrapped up my analysis of Dyre. A PDF document can be found in my papers repo. Most of the document focuses on the different stages in which Dyre interacts with the operating system. There are still some areas that I'd like to dig deeper into. For now it should be a good resource for anyone trying to identify a machine infected with Dyre or wanting to know more about the family of malware.

During the reversing process I found one part of Dyre's functionality worthy of a post. As with most banking trojans, Dyre contains functionality to hook APIs to log browser traffic. Typically, to get the addresses of the APIs, a sample will call GetProcAddress or manually traverse the portable executable file format to resolve symbols. If you are unfamiliar with the latter technique I'd highly recommend reading section 3.3 of "Understanding Windows Shellcode" by Skape [1]. Dyre attempts to hook APIs in firefox.exe, chrome.exe and iexplore.exe. It uses the standard GetProcAddress approach for resolving symbols in firefox.exe, is unsuccessful in chrome.exe, and uses the GetProcAddress approach for the APIs LoadLibraryExW and CreateProcessInternalW in iexplore.exe.

Dyre also hooks two APIs in WinInet.dll, but it does so in a unique way. Dyre reads the image header timedatestamp [2] from WinInet. This value contains the time and date at which WinInet.dll was created by the linker. It then compares the timedatestamp to a list of timedatestamps stored by Dyre. The list presumably contains every time stamp for WinInet.dll from '2004-08-04 01:53:22' to '2014-07-25 04:04:59'. Below is an example of the values that can be found in the list.

seg000:00A0C05F           db    0
seg000:00A0C060 TimeStampList dd 4110941Bh              ; DATA XREF: TimeStamp:_loopr
seg000:00A0C064 dword_A0C064 dd 0                       ; DATA XREF: TimeStamp+1Cr
seg000:00A0C064                                         ; TimeStamp:loc_A07A0Dr ...
seg000:00A0C068           dd 411095F2h <- Time stamp
seg000:00A0C06C           dd 0         <- WinInet index
seg000:00A0C070           dd 4110963Fh
seg000:00A0C074           dd 0
seg000:00A0C078           dd 4110967Dh
seg000:00A0C07C           dd 0
seg000:00A0C080           dd 411096D4h
seg000:00A0C084           dd 0
seg000:00A0C088           dd 411096DDh
seg000:00A0C08C           dd 0
seg000:00A0C090           dd 41252C1Bh
seg000:00A0C094           dd 0
.....
seg000:00A0C0AC           dd 1
seg000:00A0C0B0           dd 435862A0h
seg000:00A0C0B4           dd 2
seg000:00A0C0B8           dd 43C2A6A9h
seg000:00A0C0BC           dd 3
....
seg000:00A0D230           dd 4CE7BA3Fh
seg000:00A0D234           dd 78h
seg000:00A0D238           dd 53860FB3h
seg000:00A0D23C           dd 79h
seg000:00A0D240           dd 53D22BCBh
seg000:00A0D244           dd 7Ah
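
The value compared against this list is the TimeDateStamp field of the COFF file header, which sits eight bytes past the PE signature. As a rough Python 3 sketch (the function name is mine, and it parses raw file bytes rather than a mapped module as Dyre does), it can be read like this:

```python
import struct

def pe_timedatestamp(image):
    """Return the COFF TimeDateStamp of a PE image held in `image` (bytes)."""
    if len(image) < 0x40 or image[:2] != b"MZ":
        raise ValueError("missing MZ header")
    (e_lfanew,) = struct.unpack_from("<I", image, 0x3C)
    if image[e_lfanew:e_lfanew + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    # COFF header: Machine (2 bytes), NumberOfSections (2), TimeDateStamp (4)
    (stamp,) = struct.unpack_from("<I", image, e_lfanew + 8)
    return stamp

# Tiny stand-in image: only the fields read above are populated
fake = bytearray(0x100)
fake[0:2] = b"MZ"
struct.pack_into("<I", fake, 0x3C, 0x80)        # e_lfanew
fake[0x80:0x84] = b"PE\x00\x00"
struct.pack_into("<I", fake, 0x88, 0x4802A13A)  # TimeDateStamp
print(hex(pe_timedatestamp(bytes(fake))))
```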

Values converted to time

>>> datetime.datetime.fromtimestamp(0x411095F2).strftime('%Y-%m-%d %H:%M:%S')
'2004-08-04 01:53:22'

>>> datetime.datetime.fromtimestamp(0x53D22BCB).strftime('%Y-%m-%d %H:%M:%S')
'2014-07-25 04:04:59'    
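
Note that datetime.fromtimestamp() without a time zone converts to the local time of the analysis machine, so the strings above will differ on a machine in another time zone. The PE field itself counts seconds since the Unix epoch in UTC. A small Python 3 sketch (hypothetical helper name) that prints the same two values in UTC:

```python
from datetime import datetime, timezone

def stamp_to_utc(stamp):
    """Format a PE TimeDateStamp as a UTC date and time string."""
    moment = datetime.fromtimestamp(stamp, tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S")

print(stamp_to_utc(0x411095F2))  # 2004-08-04 07:53:22
print(stamp_to_utc(0x53D22BCB))  # 2014-07-25 10:04:59
```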

If the timedatestamp is not present or an error occurs, Dyre will send the hash of WinInet to the attacker's server. If the hash is not found, it will send WinInet back to the attackers. Below are some of the strings involved, including those responsible for displaying errors for the command and control.

'/%s/%s/63/file/%s/%s/%s/'
"Check wininet.dll on server failed"
"Send wininet.dll failed"

If the timedatestamp is found in the list, the next value is used as an index into another list. For example, if the timedatestamp was 4802A13Ah it would be found at entry 49 of the list (counting from zero) and the next value would be 0x15, or 21.

Data
seg000:00A0C1E8           dd 4802A13Ah  <- '2008-04-13 18:11:38'
seg000:00A0C1EC           dd 15h  <- 21 index

Assembly to read index value

seg000:00A07A0D           movsx   edx, word ptr ds:TimeStampIndex[eax*8] ; edx = 21
seg000:00A07A15           lea     edx, [edx+edx*2] ; edx  = 63
seg000:00A07A18           mov     edx, ds:offset[edx*4]
seg000:00A07A1F           mov     [ecx], edx            ; save off value

Python: calculate offset
Python>hex(0x0A0D3E0 + (21+21* 2) * 4)
0xa0d4dc

Read
seg000:00A0D4DC           dw 0F3Ch  <- 0x0F3C, offset to the inline hook in WinInet
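
The lookup steps above can be put together in a short Python 3 sketch. The function names are hypothetical, the table is a two-entry stand-in, and the 12-byte record size is read off the lea/mov pair in the assembly (index * 3 * 4).

```python
import struct

def find_index(table, stamp):
    """Scan (timestamp, index) pairs of 8 bytes each; return the index or None."""
    for off in range(0, len(table) - 7, 8):
        # The index is read as a signed word, like the movsx in the assembly
        ts, idx = struct.unpack_from("<Ih", table, off)
        if ts == stamp:
            return idx
    return None

def record_address(base, idx):
    """Address read by: lea edx, [edx+edx*2] / mov edx, ds:offset[edx*4]."""
    return base + idx * 3 * 4

# Stand-in table with two of the entries shown in the listings above
table = struct.pack("<IIII", 0x411095F2, 0, 0x4802A13A, 0x15)
idx = find_index(table, 0x4802A13A)
print(idx, hex(record_address(0xA0D3E0, idx)))
```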

The value 0xF3C plus the base address of WinInet is the function prologue of ICSecureSocket::Send_Fsm. Dyre uses this to know the address at which to place its hooks.

ICSecureSocket::Send_Fsm(CFsm_SecureSend *)
    
77200F37    90              NOP
77200F38    90              NOP
77200F39    90              NOP
77200F3A    90              NOP
77200F3B    90              NOP
77200F3C  - E9 C7F0398A     JMP 015A0008   <- Inline hook
015A0008    68 4077A000     PUSH 0A07740
015A000D    C3              RETN

00A07740    55              PUSH EBP
00A07741    8BEC            MOV EBP,ESP
00A07743    83EC 08         SUB ESP,8
00A07746    894D FC         MOV DWORD PTR SS:[EBP-4],ECX
00A07749    68 2077A000     PUSH 0A07720
00A0774E    FF75 08         PUSH DWORD PTR SS:[EBP+8]
00A07751    FF75 FC         PUSH DWORD PTR SS:[EBP-4]
00A07754    FF15 94DEA000   CALL DWORD PTR DS:[A0DE94]
00A0775A    8945 F8         MOV DWORD PTR SS:[EBP-8],EAX

It also hooks ICSecureSocket::Receive_Fsm in the same fashion.
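
The hook bytes in the listing can be decoded by hand. An E9 jump stores a signed 32-bit displacement relative to the end of the five-byte instruction, and the trampoline is a PUSH imm32 followed by RETN. A Python 3 sketch with hypothetical helper names that resolves both hops:

```python
import struct

def jmp_rel32_target(address, code):
    """Resolve the destination of a 5-byte E9 (JMP rel32) located at `address`."""
    if len(code) < 5 or code[0:1] != b"\xE9":
        raise ValueError("not a JMP rel32")
    (rel32,) = struct.unpack_from("<i", code, 1)
    # The displacement is relative to the address of the next instruction
    return (address + 5 + rel32) & 0xFFFFFFFF

def push_ret_target(code):
    """Resolve the destination of a PUSH imm32 / RETN trampoline."""
    if len(code) < 6 or code[0:1] != b"\x68" or code[5:6] != b"\xC3":
        raise ValueError("not a PUSH imm32 / RETN pair")
    (target,) = struct.unpack_from("<I", code, 1)
    return target

hook = jmp_rel32_target(0x77200F3C, b"\xE9\xC7\xF0\x39\x8A")
handler = push_ret_target(b"\x68\x40\x77\xA0\x00\xC3")
print(hex(hook), hex(handler))
```

The two results match the addresses in the listing: the patched prologue jumps to 0x015A0008, which then transfers control to Dyre's handler at 0x00A07740.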

Closing 
Rather than calling GetProcAddress (the hooked APIs are not exported), Dyre stores the timedatestamp and patch offset of every known version of WinInet to avoid triggering heuristic-based scanners. Seems like an arduous approach but still kind of cool. Another interesting fact is that Dyre has the ability to patch Trusteer's RapportGP.dll if it is found in the browser's memory. Dyre is actually a family of malware worthy of a deep dive. At first glance I ignored it because everything looked pretty cut & paste. I'd recommend that others check it out. If you find anything useful please shoot me an email. Cheers.

Hash Analyzed 099c36d73cad5f13ec1a89d5958486060977930b8e4d541e4a2f7d92e104cd21
  1. http://www.nologin.org/Downloads/Papers/win32-shellcode.pdf
  2. http://msdn.microsoft.com/en-us/library/ms680313.aspx

reg+displ

I have been reversing Dyre in my spare time. I'm hoping to have a full analysis out in the next week or two. Something kind of annoying about Dyre is that it uses what looks like a massive structure to store its data and function pointers. For example, in the image below we can see it passing a handle stored at [eax+0x130] to WaitForSingleObject.
Manually tracing the code or searching through all the cross references to find what populated the value is kind of painful. Since the displacement value of 0x130 (304) is fairly unique, it can be targeted very easily in IDAPython.

import idautils
import idaapi
import idc
displace = {}

# for each known function 
for func in idautils.Functions():
    flags = idc.GetFunctionFlags(func)
    # skip library & thunk functions 
    if flags & idc.FUNC_LIB or flags & idc.FUNC_THUNK:
        continue  
    dism_addr = list(idautils.FuncItems(func))
    for curr_addr in dism_addr:
        op = None
        index = None 
        # same as idc.GetOptype, just a different way of accessing the types
        idaapi.decode_insn(curr_addr)
        if idaapi.cmd.Op1.type == idaapi.o_displ:
            op = 1
        if idaapi.cmd.Op2.type == idaapi.o_displ:
            op = 2
        if op is None:
            continue 
        if "bp" in idaapi.tag_remove(idaapi.ua_outop2(curr_addr, 0)) or \
               "bp" in idaapi.tag_remove(idaapi.ua_outop2(curr_addr, 1)):
            # ebp will return a negative number
            if op == 1:
                index = (~(int(idaapi.cmd.Op1.addr) - 1) & 0xFFFFFFFF)
            else:
                index = (~(int(idaapi.cmd.Op2.addr) - 1) & 0xFFFFFFFF)
        else:
            if op == 1:
                index = int(idaapi.cmd.Op1.addr)
            else:
                index = int(idaapi.cmd.Op2.addr)
        # create key for each unique displacement value 
        if index:
            if index not in displace:
                displace[index] = []
            displace[index].append(curr_addr)
The above code will create a dictionary of all the displacement values in known functions. A simple for loop can be used to find the address and disassembly of all uses for the defined displacement value.
Python>for x in displace[0x130]: print hex(x), GetDisasm(x)
0x10004f12 mov     [esi+130h], eax
0x10004f68 mov     [esi+130h], eax
0x10004fda push    dword ptr [esi+130h]  ; hObject
0x10005260 push    dword ptr [esi+130h]  ; hObject
0x10005293 push    dword ptr [eax+130h]  ; hHandle
0x100056be push    dword ptr [esi+130h]  ; hEvent
0x10005ac7 push    dword ptr [esi+130h]  ; hEvent
Python>
With these addresses it is easy to find where the value is populated.


The dictionary created by the script is named displace. It will contain all displacement values.  Not super 1337 but still useful. Cheers.
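
The bit twiddling in the "bp" branch of the script is a 32-bit two's complement negation: it turns an operand value such as 0xFFFFFFF4 (how [ebp-0Ch] is stored) into 0xC. It can be tried outside of IDA with plain Python. The helper names below are mine, and the signed variant is only a suggestion for telling locals and arguments apart.

```python
MASK32 = 0xFFFFFFFF

def negate32(value):
    """Same expression as the script: 32-bit two's complement negation."""
    return (~(value - 1)) & MASK32

def to_signed32(value):
    """Alternative view: interpret the 32-bit operand value as a signed offset."""
    value &= MASK32
    return value - 0x100000000 if value & 0x80000000 else value

print(hex(negate32(0xFFFFFFF4)), to_signed32(0xFFFFFFF4), hex(negate32(8)))
```

One thing to keep in mind when reading the dictionary: a positive ebp displacement such as [ebp+8] goes through the same expression and ends up under the key 0xFFFFFFF8.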

Backtrace POC - Stack Strings

Example 1 Hex View
There are a number of tools that cover char strings in IDA. If you are not familiar with char strings, they are a low-hanging obfuscation technique to thwart analysts from viewing the strings inside of an executable. Some notable tools and posts on this topic are [1] & [2]. In the image above you can see the string DBG. Odds are that if we were viewing the executable in a hex editor or using strings, this wouldn't stick out.

Example 1 Assembly View
If we were watching the stack of the executable at run time we would see something constructed similar to the string/comment above.
Example 2
The code can be run in two modes. The first is by selecting the code and then double clicking the script in IDA (ALT+F9). In the example above we can see the string "W32Time". My code attempts to reconstruct the stack memory. The buffer can be accessed via the list object.str_buff. In the Output window above you can see the content of the buffer dumped to standard out. This makes it easy to format the data and access it via an index. The commented data is an example of how the string would look on the stack in Ollydbg. The second way to execute the code is to pass an address within a function to object.run( address ). This will try to rebuild the stack for the whole function. All of this is done statically. Char strings that are populated via registers (such as mov [ebp+var_c], bl when bl is 0x4f in the Example 1 image) are traced back using backtrace.py. For more details on backtrace please see the following link.

As previously mentioned this topic has already been covered. I'm posting this code because it's a good example of using backtrace.py. I had fun working on this one. The code handles all examples I have found so far. There is an issue with formatting constructed wide char strings. Not exactly sure of the best approach. I tried to keep the data flexible so it should be easy to write a function to format the data.
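
To make the idea of rebuilding the stack buffer concrete without IDA, here is a simplified Python 3 model. It is not the Frame2Buff code: the function names, the (ebp offset, byte) input format and the sample writes are assumptions, and the real script has to recover these values from the disassembly first.

```python
def rebuild_stack_buffer(writes, frame_size):
    """Replay (ebp_offset, byte) writes into a list that models the frame."""
    buff = [None] * frame_size
    for ebp_offset, value in writes:
        if not -frame_size <= ebp_offset < 0:
            raise ValueError("write falls outside the modelled frame")
        # [ebp-frame_size] maps to index 0, [ebp-1] to the last index
        buff[frame_size + ebp_offset] = value & 0xFF
    return buff

def extract_strings(buff):
    """Join runs of written, non-zero bytes into strings."""
    strings, current = [], []
    for value in buff + [None]:  # the sentinel flushes the last run
        if value:
            current.append(chr(value))
        elif current:
            strings.append("".join(current))
            current = []
    return strings

# Byte moves can appear in any order in the code; the offsets restore the order
writes = [(-0xA, 0x47), (-0xC, 0x44), (-0xB, 0x42), (-0x9, 0x00)]
print(extract_strings(rebuild_stack_buffer(writes, 0x10)))
```

Keeping the frame as a list of byte values, rather than a finished string, leaves the formatting decision (ASCII or wide char) to a later step.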

[1]. Automatic Recovery of Constructed Strings in Malware by Jay Smith of FireEye - link
[2]. Finding Byte Strings using IDAPython by Jason Jones of Arbor Networks - link 

Repo - Link

Code for reviewing

"""
Author:
    Alexander Hanel 
Date:
    20140902
Version:
    1  - should be good to go.
Summary:
    Examples of using the backtrace library to rebuild strings

TODO:
    * How to deal with printing wide char strings?
    * What is the size of the frame buffer if GetFrameSize returns something
      smaller than the frame/stack index or the IDA does not recognize the function?

Notes:
    idaapi.o_phrase # Memory Ref [Base Reg + Index Reg]
    o_phrase   =  idaapi.o_phrase    #  Memory Ref [Base Reg + Index Reg]    phrase
    o_displ    =  idaapi.o_displ     #  Memory Reg [Base Reg + Index Reg + Displacement] phrase+addr

Useful Reads
    http://smokedchicken.org/2012/05/ida-rename-local-from-a-script.html
    http://zairon.wordpress.com/2008/02/15/idc-script-and-stack-frame-variables-length/
"""
import sys, os, logging, copy
from binascii import unhexlify
# Add the parent directory to Python Path
sys.path.append(os.path.realpath(__file__ + "/../../"))
# import the backtrace module
from backtrace import *

class Frame2Buff:
    def __init__(self):
        self.verbose = False
        self.func_start = idc.SelStart()
        # SelEnd() returns the address following the last selected instruction
        self.func_end = idc.SelEnd()
        self.esp = False
        self.ebp = False
        self.comment = True
        self.frame_size = None
        self.bt = None
        self.str_buff = None
        self.formatted_buff = ""
        self.format = True

    def run(self, func_addr=None):
        """ run and create Frame2Buff"""
        # check if code is selected or if using the whole function
        if self.func_start == BADADDR or self.func_end == BADADDR:
            if func_addr is None:
                if self.verbose:
                    print "ERROR: No addresses selected or passed"
                return None
        if func_addr:
            self.func_start = idc.GetFunctionAttr(func_addr, FUNCATTR_START)
            self.func_end = idc.GetFunctionAttr(func_addr, FUNCATTR_END)
        if self.func_start == BADADDR:
            if self.verbose:
                print "ERROR: Invalid address"
            # nothing to rebuild without a valid start address
            return None
        self.frame_size = GetFrameSize(self.func_start)
        try:
            self.bt = Backtrace()
            self.bt.verbose = False
        except (ImportError, NameError):
            # Backtrace is undefined if the star import above did not provide it
            print "ERROR: Could not import Backtrace - aborting"
            return None
        self.func_end = PrevHead(self.func_end)
        self.populate_buffer()
        if self.format:
            self.format_buff()
        if self.comment:
            self.comment_func()

    def populate_buffer(self):
        curr_addr = self.func_start
        self.str_buff = list('\x00' * self.frame_size)
        while curr_addr <= self.func_end:
            index = None
            idaapi.decode_insn(curr_addr)
            # check if instr is MOV, [esp|ebp + index], variable
            if idaapi.cmd.itype == idaapi.NN_mov and idaapi.cmd.Op1.type == idaapi.o_displ:
                if "bp" in idc.GetOpnd(curr_addr, 0):
                    # ebp will return a negative number
                    index = (~(int(idaapi.cmd.Op1.addr) - 1) & 0xFFFFFFFF)
                    self.ebp = True
                else:
                    index = int(idaapi.cmd.Op1.addr)
                    self.esp = True
                if idaapi.cmd.Op2.type == idaapi.o_reg:
                    # value needs to be traced back
                    self.bt.backtrace(curr_addr, 1)
                    # tainted means the register was zeroed with xor reg, reg;
                    # odds are it is being used to initialize a variable.
                    if self.bt.tainted != True:
                        last_ref = self.bt.refsLog[-1]
                        idaapi.decode_insn(int(last_ref[0]))
                        data = idaapi.cmd.Op2.value
                    else:
                        # tracked variable has been set to zero by xor reg, reg
                        curr_addr = idc.NextHead(curr_addr)
                        continue
                elif idaapi.cmd.Op2.type != idaapi.o_imm:
                    curr_addr = idc.NextHead(curr_addr)
                    continue
                else:
                    data = idaapi.cmd.Op2.value
                if data:
                    try:
                        hex_values = hex(data)[2:]
                        if hex_values[-1] == "L":
                            hex_values = hex_values[:-1]
                        if len(hex_values) % 2:
                            hex_values = "0" + hex_values
                        temp = unhexlify(hex_values)
                    except:
                        if self.verbose:
                            print "ERROR: Unhexlify Issue at %x %s (not added)" % (curr_addr, idc.GetDisasm(curr_addr))
                        curr_addr = idc.NextHead(curr_addr)
                        continue
                else:
                    curr_addr = idc.NextHead(curr_addr)
                    continue
                # GetFrameSize is not always a reliable buffer size.
                # If a write lands past the frame, grow the buffer as long as
                # the index is less than 2 * frame size. Anything larger is
                # more likely an error.
                if self.ebp or self.esp:
                    cal_index = index + len(temp)
                    if cal_index > self.frame_size:
                        if cal_index < (self.frame_size * 2):
                            for a in range(cal_index - self.frame_size):
                                self.str_buff.append("\x00")
                                if self.verbose:
                                    print "ERROR: Frame size incorrect, appending"
                if self.ebp:
                    # reverse the buffer
                    temp = temp[::-1]
                    for c, ch in enumerate(temp):
                        try:
                            self.str_buff[index - c] = ch
                        except:
                            if self.verbose:
                                print "ERROR: Frame EBP index invalid: at %x" % (curr_addr)
                if self.esp:
                    for c, ch in enumerate(temp):
                        try:
                            self.str_buff[index + c] = ch
                        except:
                            if self.verbose:
                                print "ERROR: Frame ESP index invalid: at %x" % (curr_addr)
            curr_addr = idc.NextHead(curr_addr)
        # reverse the buffer to match index
        if self.ebp == True:
            self.str_buff = self.str_buff[::-1]
            self.str_buff.pop()



    def format_buff(self):
        self.formatted_buff = ""
        temp_buff = copy.copy(self.str_buff)

        if self.ebp == True:
            temp_buff = temp_buff[::-1]
            temp_buff.pop()

        if self.str_buff:
            for index, ch in enumerate(temp_buff):
                try:
                    if ch == "\x00" and temp_buff[index + 1] != "\x00":
                        self.formatted_buff += " "
                except:
                    pass
                if ch != "\x00":
                    self.formatted_buff += ch

    def comment_func(self):
        idc.MakeComm(self.func_end, self.formatted_buff)

"""
Example:
    Create a buffer of the whole function

x = Frame2Buff()
x.run(here())  # func addr

"""
x = Frame2Buff()
x.run() # select data