Blog (pg. 20)

  • Published on
    To generate random passwords I have written a simple class in VB.NET, which accepts a string containing valid password characters and then generates passwords of n length using the specified characters.
    Public Class PasswordGenerator
            Private _pwChars As String
    
            Public Sub New()
                MyClass.New("abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789")
            End Sub
    
            Public Sub New(ByVal validChars As String)
                _pwChars = validChars
            End Sub
    
            Public Function GeneratePassword(ByVal length As Integer) As String
                Dim password As String = ""
                Dim rndNum As New Random()
    
                For i As Integer = 1 To length
                    'the upper bound of Random.Next is exclusive, so pass the full length
                    password &= _pwChars.Substring(rndNum.Next(0, _pwChars.Length), 1)
                Next
    
                Return password
            End Function
    
        End Class
  • Published on
    If you are trying to make thread-safe calls to Windows Forms controls, you can run into a deadlock when calling 'Invoke' from your worker thread while the main thread is blocked in Thread.Join on that same thread: Invoke waits for the main thread to process the call, and the main thread waits for the worker to finish. I posted a workaround for this on the Microsoft site, which uses an extra thread to make the call to the Invoke method, so the worker never blocks on it. VB.NET example:
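    'delegate type used by AsyncInvoke to marshal the call onto the UI thread
    Private Delegate Sub SetTextCallback(ByVal str_text As String)
    
    Private Sub SetText(ByVal str_text As String)
            If Me.textbox1.InvokeRequired Then
                    'call an asynchronous invoke, by calling it then forcing a DoEvents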
                    Dim asyncInvokeThread As New Threading.Thread(New Threading.ParameterizedThreadStart(AddressOf AsyncInvoke))
                    asyncInvokeThread.IsBackground = True
                    asyncInvokeThread.Start(str_text)
    
                    Application.DoEvents()
            Else
                    Me.textbox1.Text = str_text
            End If
    End Sub
    Private Sub AsyncInvoke(ByVal obj_text As Object)
            Dim str_text As String = CStr(obj_text)
            Dim d As New SetTextCallback(AddressOf SetText)
            Me.Invoke(d, New Object() {str_text})
    End Sub
  • Published on
    I joined in a "competition" with some of the experts in the assembly forums over at experts-exchange to write the fastest assembly code Base64 encoder, following the guidelines of RFC 3548. Several versions emerged, some using lookup tables to precompute results, which fared well on CPUs with large caches. My version was more about opcode optimization and using bitwise arithmetic over decimal arithmetic, which achieved the best results on CPUs that have better pipelining. Code:
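    //note: the main loop runs once before len is checked, so len must be at least 3;
    //pszOutBuf must hold 4 chars per 3 source bytes (rounded up) and is not null-terminated
    void ToBase64( BYTE* pSrc, char* pszOutBuf, int len )
    {
          const char* chr_table="ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";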
    
           __asm{
                      mov ecx, len
                      mov esi, pSrc                              //;bytes from source
                      mov edi, chr_table
                      push ebp
                      mov ebp, pszOutBuf
    
    src_byteLoop:
    
                      xor eax, eax
    
                      //;read 3 bytes
                      mov ah, byte ptr[esi]
                      mov al, byte ptr[esi+1]
                      shl eax, 16
                      mov ah, byte ptr[esi+2]
    
                      //;manipulate in edx bitset1
                      mov edx, eax
                      shl eax, 6                                    //;done first 6 bits
    
                      shr edx, 26            
                      mov bl, byte ptr [edi+edx]            //;put char in buffer
                      mov byte ptr[ebp], bl
                      inc ebp                                          //;next buf
    
                      //;manipulate in edx bitset2
                      mov edx, eax
                      shl eax, 6                                    //;done first 6 bits
    
                      shr edx, 26
                      mov bl, byte ptr [edi+edx]            //;put char in buffer
                      mov byte ptr[ebp], bl
                      inc ebp                                          //;next buf
    
                      //;manipulate in edx bitset3
                      mov edx, eax
                      shl eax, 6                                    //;done first 6 bits
    
                      shr edx, 26
                      mov bl, byte ptr [edi+edx]            //;put char in buffer
                      mov byte ptr[ebp], bl
                      inc ebp                                          //;next buf
    
                      //;manipulate in edx bitset4
                      mov edx, eax
                      shl eax, 6                                    //;done first 6 bits
    
                      shr edx, 26
                      mov bl, byte ptr [edi+edx]            //;put char in buffer
                      mov byte ptr[ebp], bl
                      inc ebp                                          //;next buf
    
                      //;done these bytes
                      add esi, 3
                      sub ecx, 3
    
                      cmp ecx, 3
                      jge src_byteLoop                        //;still got src bytes
    
                      xor eax, eax                              //;set to zero (pad count)
                      cmp ecx, 0
                      jz finished
    
                            //;need to pad out some extra bytes
    
                            //;read in 3 bytes regardless of junk data following pSrc (eax is already zero from above)
                            mov ah, byte ptr[esi]
                            mov al, byte ptr[esi+1]
                            shl eax, 16
                            mov ah, byte ptr[esi+2]
    
                            sub ecx, 3                                    //;bytes just read
                            neg ecx                                          //;+ve inverse
                            mov edx, ecx                              //;save how many bytes need padding
    
                            //;as per the RFC, any padded bytes should be 0s
                            mov esi, 0xFFFFFF
                            lea ecx, dword ptr[ecx*8+8]            //;calculate bitmask to shift
                            shl esi, cl
                            and eax, esi                              //;mask out the junk bytes
    
                            mov ecx, edx                              //;restore pad count
    
                            //;manipulate in edx byte 1
                            mov edx, eax
                            shl eax, 6                                    //;done first 6 bits                        
    
                            shr edx, 26
                            mov bl, byte ptr [edi+edx]            //;put char in buffer
                            mov byte ptr[ebp], bl
                            inc ebp                                          //;next buf
    
                            //;manipulate in edx byte 2
                            mov edx, eax
                            shl eax, 6                                    //;done first 6 bits                        
    
                            shr edx, 26
                            mov bl, byte ptr [edi+edx]            //;put char in buffer
                            mov byte ptr[ebp], bl
                            inc ebp                                          //;next buf
    
                            //;manipulate in edx byte 3
                            mov edx, eax
                            shl eax, 6                                    //;done first 6 bits                        
    
                            shr edx, 26
                            mov bl, byte ptr [edi+edx]            //;put char in buffer
                            mov byte ptr[ebp], bl
                            inc ebp                                          //;next buf
    
                            //;manipulate in edx byte 4
                            mov edx, eax
                            shl eax, 6                                    //;done first 6 bits                        
    
                            shr edx, 26
                            mov bl, byte ptr [edi+edx]                  //;put char in buffer
                            mov byte ptr[ebp], bl
                            inc ebp                                          //;next buf
    
                            mov eax, ecx                              //;'return' pad count
    
    finished:
                      test eax, eax
                      jz end
                      //;some bytes were padding, put them as =
    
                            sub ebp, eax                        //;move ptr back for num bytes to pad
    padChars:
                            mov byte ptr[ebp], 0x3d            //;=
                            inc ebp
                            dec eax
    
                            jnz padChars
    
    end:
                      pop ebp
            }      
    }
    There were several key points to my optimization technique:
    • Firstly, unrolling of loops, which basically means replacing a loop structure with manually repeated code for each operation on the data (here, the four 6-bit extractions per 3-byte group). In terms of asm optimizations this reduces branching, so EIP is redirected less often.
    • Another key point was minimizing the number of instructions spent on load/store operations. To keep these down, you can use indexed addressing to get to the byte in memory you are interested in, e.g. the sequence mov edx, base / add edx, edi / mov bl, byte ptr[edx] becomes the single instruction mov bl, byte ptr[edx+edi].
    • As Base64 encoding requires you to work at bit level using 3 bytes at a time, you can speed up the conversion of the bit sets by using the binary arithmetic operations SHL and SHR (shift left/right) to progressively take the most significant bits of one DWORD and move them down to the least significant positions. For example, if EAX contains a DWORD and you copy it to EDX, you can get rid of the leftmost 6 bits of EAX by shifting left by 6, and you can isolate those same 6 bits in EDX by shifting EDX right by 26. This is much quicker than shifting out one bit at a time.
    My final throughput on the official test machine was 196.92 MB/s, which is almost double the throughput of the CryptoAPI function, which came in at 104.05 MB/s.
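    The shift technique in the last bullet can also be written in portable C++. The following is only a rough sketch to show the bit manipulation, not a translation of the competition entry: toBase64Sketch is a hypothetical name introduced here, it builds a std::string instead of writing to a caller-supplied buffer, and it makes no attempt at the same performance.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper (not from the original post): encodes `len` bytes using
// the same shift-left-6 / shift-right-26 pattern as the assembly version.
std::string toBase64Sketch(const unsigned char* src, std::size_t len)
{
    static const char table[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    assert(src != nullptr || len == 0);

    std::string out;
    for (std::size_t i = 0; i < len; i += 3) {
        std::size_t n = (len - i < 3) ? (len - i) : 3;  // bytes in this group

        // Pack the group into the top 24 bits of a 32-bit word; bytes that
        // are missing from a short final group stay zero.
        std::uint32_t word = 0;
        for (std::size_t k = 0; k < n; ++k)
            word |= static_cast<std::uint32_t>(src[i + k]) << (24 - 8 * k);

        // Peel off four 6-bit sets from the most significant end.
        for (int set = 0; set < 4; ++set) {
            out += table[word >> 26];  // top 6 bits become the table index
            word <<= 6;                // discard them, exposing the next set
        }

        // A short final group gets '=' in place of its trailing characters.
        for (std::size_t p = 0; p < 3 - n; ++p)
            out[out.size() - 1 - p] = '=';
    }
    return out;
}
```

    Calling toBase64Sketch on the three bytes of "Man" gives "TWFu". A final group of one or two bytes is zero-filled before the shifts, which is the job the bitmask does in the assembly version.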
  • Published on
    When writing the "UltraChat" program at university, I needed to write a port scanner that would look for running UltraChat servers. A port scanner basically attempts to connect to a host on a given port and returns true or false depending on whether a connection was established. The problem with doing this is that when a connection cannot be established, Winsock (Windows Sockets) will keep trying for a while to allow for servers that are slow to reply, which slows the whole process down. To get around this I used a simple threading and timer technique to kill the request if it doesn't succeed within a given time. Code:
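    //forward declaration: the thread routine is defined below quickConnect
    DWORD WINAPI quickConnectTTL(LPVOID sAddressPTR);
    
    bool quickConnect(char* remoteHostIP, int port)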
    {
          struct sockaddr_in sAddress;
    
          //set up connection info
          sAddress.sin_family = AF_INET;
          sAddress.sin_port = htons(port);                                                      //port
          sAddress.sin_addr.S_un.S_addr = inet_addr(remoteHostIP);                  //IP
    
          //use a thread to do the connection
          HANDLE tConnThread;
          DWORD threadID;
          DWORD tExitCode=0;
    
          tConnThread=CreateThread(NULL,0,&quickConnectTTL,&sAddress,0,&threadID);
    
          //now wait 1 second maximum for a response; WaitForSingleObject returns
          //as soon as the thread exits, or once the timeout (in milliseconds) elapses.
          //(comparing SYSTEMTIME fields by hand is error-prone: seconds wrap at 60)
          WaitForSingleObject(tConnThread,1000);
    
          //get the return value from connection
          GetExitCodeThread(tConnThread,&tExitCode);
    
          //if thread did not exit, time is up, close the thread and assume no server
          if(tExitCode==STILL_ACTIVE){
                TerminateThread(tConnThread,0);
                tExitCode=0;
          }
    
          CloseHandle(tConnThread);
    
          bool present;
          //check return values
          if(tExitCode==1){
                present=true;
          }
          else{
                present=false;
          }
    
          return present;
    
    }
    
    DWORD WINAPI quickConnectTTL(LPVOID sAddressPTR){
          //copy struct info from PTR
          struct sockaddr_in sAddress;
          memcpy(&sAddress,sAddressPTR,sizeof(struct sockaddr_in));
    
          //connect to remote host
          SOCKET s=socket(AF_INET,SOCK_STREAM,0);
          if(s==INVALID_SOCKET){
                return 0;
          }
          int concode=connect(s,(struct sockaddr *)&sAddress,sizeof(sAddress));
    
          //close the socket either way so a failed attempt does not leak it
          closesocket(s);
    
          //the return value becomes the thread exit code read by quickConnect
          return (concode==SOCKET_ERROR) ? 0 : 1;
    }
    The above example waits up to 1 second for a reply and otherwise assumes no server is present. To use the code, you need to include the Windows Sockets headers (e.g. winsock2.h), link against ws2_32.lib and initialise Winsock with WSAStartup, then call quickConnect passing the host IP and port.
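    The thread-plus-time-limit pattern is not specific to Winsock. As a rough, portable sketch of the same idea, the following uses standard C++11 threads rather than the Win32 calls from the post; tryWithTimeout and its parameters are hypothetical names introduced here, and the blocking connection attempt is represented by any callable that returns bool.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <future>
#include <memory>
#include <thread>

// Hypothetical helper, not part of the original post: runs `attempt` on a
// worker thread and waits at most `timeout` for it to report a result.
bool tryWithTimeout(std::function<bool()> attempt,
                    std::chrono::milliseconds timeout)
{
    assert(attempt);  // an empty std::function would throw on the worker thread

    // The promise is shared so it outlives this call if the worker is late.
    auto result = std::make_shared<std::promise<bool>>();
    std::future<bool> done = result->get_future();

    // Detach the worker: a slow attempt finishes (or fails) in its own time.
    std::thread([result, attempt] { result->set_value(attempt()); }).detach();

    // Give up once the time limit passes, as quickConnect does after 1 second.
    if (done.wait_for(timeout) != std::future_status::ready)
        return false;
    return done.get();
}
```

    One design difference from the post: the sketch never kills the worker. A late attempt simply runs to completion in the background and its result is ignored, which avoids the resources a forcibly terminated thread can leave behind.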
  • Published on
    Sometimes it is necessary to extend the functionality of a program to which you don't have the source code. This tutorial focuses on adding functionality to a compiled Windows Portable Executable (PE) file, but the idea can probably be implemented on other compiled binaries. Simple functionality can be added directly in 'code caves' within the PE file, by writing the assembly code straight into the exe (i.e. using a hex editor with an assembler to render the opcodes into some empty space in the exe file). The more complex the functionality, the more assembly code this will require, which needs bigger code caves and takes more coding on your part. You can create bigger code caves by adding new sections to the file, but that is not the topic of this tutorial.
    An easier way, I have found, is to code the extra functionality in a DLL, which is best written in C/C++ as these are easier to import into the target app. You should use the extern "C" linkage specifier on your exports to make them asm friendly (no name mangling) and specify '__declspec(dllexport)' to make the compiler put the function in the DLL's export table. For example: (AddOns.h)
    #ifdef ADDONS_EXPORTS
    #define ADDONS_API extern "C" __declspec(dllexport)
    #else
    #define ADDONS_API extern "C" __declspec(dllimport)
    #endif
    
    ADDONS_API bool someFunction(LPCTSTR someTextParam);
    You can code the core of the functionality in the functions defined in your DLL, and then the only assembly you need to write is the code that loads your DLL and calls those functions. The above function, for example, will be in a DLL called 'AddOns.dll' and takes 1 parameter (a pointer to a string). In assembly code you can load the DLL using the LoadLibrary function and find the address of the function using GetProcAddress. If the exe does not import these functions, read my other blog post on Finding the address of GetProcAddress.
    push &"AddOns.dll"                ; address of string for the DLL, in a code cave
    call &LoadLibraryA                ; address of the imported/loaded LoadLibraryA function
    push &"someFunction"              ; address of string for the function, in a code cave
    push eax                          ; HMODULE AddOns.dll (returned from LoadLibrary)
    call &GetProcAddress              ; address of the imported/loaded GetProcAddress function (it has no A/W variants)
    push &"SomeString"                ; address in the app to the parameter you want to pass
    call eax                          ; call the function in the dll (returned from GetProcAddress)
    add esp, 4                        ; clear the stack
    jmp &returnAddress                    ; go back to the original code?
    Since C functions use the cdecl calling convention by default, you have to clean the stack of your params when you are finished and if your function returns a value, it will be in EAX. Once you have put the above code in a cave somewhere, you simply need to jmp to it at the appropriate point in the original exe and then have it jmp back to the best place in the code to give control back to the normal program flow when you have finished.