Optimization in disassembly code

I. Optimization methods

Arithmetic in the source, such as addition, subtraction, multiplication, division and modulus, is all subject to optimization, so the assembly may not match the source operation for operation.

Classification of optimization methods

  • Constant folding
  • Constant propagation
  • Variable removal
  • Merge optimization
  • CPU pipeline optimization
  • Mathematical transformation
  • Unreachable branch optimization
  • Code hoisting optimization

These optimizations assume a Release build with the O2 (maximize speed) option turned on. A Debug build is optimized too, but its priority is to make debugging easy for the programmer, so it only applies optimizations that do not interfere with debugging.

Constant folding

int n = 0;
int m = 1;
printf("%d",7+8);
printf("%d",n+6);
printf("%d",n+m);

Constant folding applies to expressions whose operands are all constants known at compile time. The compiler can evaluate such an expression itself and replace it with a single constant value.

For example, 7 + 8 above does not generate an add instruction. In the compiled program it appears directly as 15 (0xF):

printf("%d", 7 + 8);
00671873  push        0Fh  
00671875  push        offset string "%d" (0677B30h)  
0067187A  call        _printf (06710CDh)  
0067187F  add         esp,8  
printf("%d", n + 6);
00671882  mov         eax,dword ptr [n]  
00671885  add         eax,6  
00671888  push        eax  
00671889  push        offset string "%d" (0677B30h)  
0067188E  call        _printf (06710CDh)  
00671893  add         esp,8  
printf("%d", n + m);
00671896  mov         eax,dword ptr [n]  
00671899  add         eax,dword ptr [m]  
0067189C  push        eax  
0067189D  push        offset string "%d" (0677B30h)  
006718A2  call        _printf (06710CDh)  
006718A7  add         esp,8  

In short, constant folding means that the compiler works out constant expressions for you.

Constant propagation

Constant propagation, also called constant spreading, happens when a variable holds a constant and, between its assignment and its use, nothing can modify it: its address is not taken (&) and no pointer or reference to it is passed anywhere.
Put bluntly, if no code modifies the variable, the compiler can treat the variable as a constant.

int n = 0;
printf("%d",n+6);

Because n is never modified and its address is never taken, constant propagation replaces it with its constant value:

int n = 0;
printf("%d",0+6);

The expression now qualifies for constant folding, so the code changes again:

int n = 0 ;
printf("%d",6);

Two optimizations were applied here: first constant propagation, then constant folding.

Variable removal

Variable removal means that a variable which is defined in your program but never modified gets optimized away step by step, first by constant propagation and then by constant folding.

int n = 0;
int m = 1;
printf("%d",n+m);

The compiler sees that the variables n and m are added, looks back, and finds that neither has been modified. So the code changes as follows:

printf("%d",0+1);

0 + 1 qualifies for constant folding, so the code finally becomes:

printf("%d",1);

The corresponding disassembly:

.text:00401018                 push    1
.text:0040101A                 push    offset aD       ; "%d"
.text:0040101F                 call    _printf

Merge optimization

Merge optimization: operations that can be done together are done together.
printf uses the C calling convention, so the caller has to clean up the stack after the call. It is also variadic, so the amount to clean up changes with the number of arguments pushed.
For example:

.text:00401040 sub_401040      proc near               ; CODE XREF: start-8D↓p
.text:00401040                 push    0Fh
.text:00401042                 push    offset unk_417A8C
.text:00401047                 call    sub_401010
.text:0040104C                 push    6
.text:0040104E                 push    offset unk_417A8C
.text:00401053                 call    sub_401010
.text:00401058                 push    1
.text:0040105A                 push    offset unk_417A8C
.text:0040105F                 call    sub_401010
.text:00401064                 add     esp, 18h
.text:00401067                 xor     eax, eax
.text:00401069                 retn
.text:00401069 sub_401040      endp

Each argument is 4 bytes. Six arguments were pushed in total, 4 * 6 = 24 bytes (18h), so the stack cleanup for all three calls is merged into the single add esp, 18h after the last call.

CPU pipeline optimization

CPU pipeline optimization reorders instructions without changing what the code does. Watch for it when reading disassembly. Unoptimized assembly is neat and sequential, easy to read at a glance, and quick to translate back into high-level code.

A simple hand-written example is used here, because it takes a lot of real code before pipeline optimization shows up. What follows is a simulation.

Normal assembly code instruction sequence

xor eax,eax
xor ebx,ebx
xor ecx,ecx
mov eax,1
add eax,2
mov ebx,eax
mov ecx,3

The same instructions reordered for the pipeline

xor eax,eax
mov eax,1
xor ecx,ecx
add eax,2
mov ecx,3
xor ebx,ebx
mov ebx,eax

The assembly itself is very simple.
Focus on the reordered version.
Before reordering, the code was laid out neatly in sequence. After reordering, many instructions are interleaved.
For example:

mov eax,1
xor ecx,ecx

While the CPU is executing mov eax,1 it can start on xor ecx,ecx right away, because the second instruction does not depend on the first. In the original order, each instruction depended on the one before it. When the CPU reaches an instruction and finds that it depends on the previous one, it has to wait.

mov eax,1
add eax,2

The first line writes eax and the second line uses eax, so the second line has to wait for the first.
The benefit of reordering is that the next instruction can execute while the first is still executing, without changing the result, which improves speed.

Mathematical transformation optimization

Mathematical transformation optimization: operations that have no effect on the final result are optimized away.

i=10;
b=11;
i=b+10;
i=b-0;
i=b*3;
i=i/3;

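The earlier assignments to i are overwritten before they are ever read, and multiplying by 3 then dividing by 3 cancels out, so the high-level code above is optimized directly to: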

i=b;

Unreachable branch optimization

An unreachable branch is a branch that will never be taken. It has no reason to exist, so no assembly code is generated for it.

a=10;
if(a==10)
{
xxxx
}
else
{
xxxx
}

From the assignment above, a is known to be the constant 10, so the if block is always taken and the else block never is. The else block is therefore optimized away. Real code is of course rarely this simple; nobody writes a = 10 just to test it on the next line.

Code hoisting optimization

Code hoisting is an optimization applied to loops. If nothing inside the loop body modifies the variables an expression depends on, the expression is computed once outside the loop.

int x = xxx;
while(x>y/3)
{
xxxx....
x--;
}

Nothing inside the loop body modifies y, so y / 3 has the same value on every iteration. It is moved outside the loop, and the code is optimized to:

t = y/3;
while(x>t)
{
xxxx....
x--;
}

The variable t is also likely to end up as a register variable after the optimizations described above.

Defeating the optimizations

Code obfuscation and optimization are opposites, so understanding optimization helps us simplify obfuscated code by hand.
Everything described above depends on the compiler recognizing that your program never takes the variable's address or modifies it. To prevent these optimizations, we have to find a way to keep the compiler from reaching that conclusion.