Friday, May 18, 2012

Using the stack to pass parameters between assembly functions

For a new assignment in AETC I have to pass parameters between assembly functions using the stack instead of registers.

I've made a simple test using again the power example (like in previuos posts: [1] and [2]).
In this case, an assembly function called pow_caller, which in turn calls the p_pow function, is being called from C.

This is the calling C code:
#include "stdlib.h"
#include "stdio.h"
#include "assert.h"

extern int pow_caller(int, int);

int main(void) 
  assert(1 == pow_caller(3, 0)); 
  assert(9 == pow_caller(3, 2));
  assert(25 == pow_caller(5, 2));   
  assert(36 == pow_caller(6, 2));   
  assert(27 == pow_caller(3, 3));      
  printf("All tests passed!\n");
  return EXIT_SUCCESS;
This code is calling the pow_caller function:
  push rbp      ; Store stack initial state
  mov  rbp, rsp   

  sub rsp, 8    ; We make space for p_pow result by
                ; subtracting 8 bytes to the stack 
                ; pointer (RSP)
  push rdi      ; We push the base (b) to the stack
  push rsi      ; We push the exponent (e) to the stack
  call p_pow    ; Now we call p_pow
  add rsp, 16   ; We make RSP point to the 
                ; position were the result is.
  pop rax       ; And move the result to rax so 
                ; that it's returned to the C program
  mov rsp, rbp  ; Restore stack initial state
  pop rbp

As you can see, pow_caller is passing the parameters (base and exponent) to p_pow using the stack.
Before passing the parameters it's also reserving some space (8 bytes) for the result of p_pow to be returned. That's what sub rsp, 8 is doing, (remember that the stack grows towards decreasing addresses).

Once p_pow finishes, we have to get the result from the stack. To do it we go to the place where the result is by moving the stack pointer RSP (add rsp, 16) and, once there, get the result from the stack moving it to the RAX register (pop rax), so that it's returned to the C program when pow_caller ends.
Before executing add rsp, 16, the result of p_pow was in memory at address rbp+16, but after executing add rsp, 16 it was at address rbp, so it can be reached just using pop.

It's very important to remark that the state of the stack (RSP and RBP) must be the same before and after the function to avoid funny bugs when you go back to the caller.

In pow_caller we've pushed the content of 3 registers (RBP, RDI and RSI) to the stack (24 bytes) and subtracted 8 bytes to store the result of p_pow, that's 32 bytes, so that, $rsp' = $rsp - 32, where $rsp is the content of RSP when entering p_pow_caller and $rsp' is its current value.
After p_pow finished, we added 16 bytes to RSP and then made two pops (another 16 bytes), so at the end $rsp' = $rsp again.
That last pop made also that $rbp' = $rbp again.

It only remains to see how p_pow accesses the stack to get the parameters and store the result.

This is the code of p_pow:
  push rbp
  mov  rbp, rsp    ; Store stack initial state
  push rax                   
  push rbx 
  push rcx
  push rdx  
  mov eax, 1       ; Initialize rax  

  ; We move the base (b) from the stack to ebx
  mov ebx, dword[rbp+24]  

  ; We move the exponent (e) from the stack to ecx
  mov ecx, dword[rbp+16]  

  cmp ecx, 0       ; if e==0 -> b^0=1, and we are done
  jle p_pow_end             
    mul ebx        ; eax*ebx=edx:eax
    dec ecx        ; ecx = ecx - 1
    jg  p_pow_loop ; If ecx > 0 it iterates again

  ; Move the result to the space reserved for it in the stack
  mov [rbp+32], rax      
  pop rdx          ; Restore stack initial state
  pop rcx
  pop rbx
  pop rax
  mov rsp, rbp
  pop rbp
Ok, so again the first thing we do is pushing the initial value of RBP (the value in p_pow_caller), so that we can get it back once the function has finished.

Then we copy in RBP the initial content of RSP+8 (because we did a pop). We do this because we want to be able to go on using the stack doing push and pop (which changes RSP), but we also want to be able to reach the parameters that are at certain distance in memory of the original RSP.
This explains why to get a parameter from the stack we use [rbp+distance].
In p_pow_caller we pushed the base first and then the exponent. Since the stack is a LIFO data structure, the exponent is closer to RBP than the base.
How far is it? Well, before pushing p_pow initial RBP content, the exponent was at a distance of 8 bytes from the caller's RSP and the base at a distance of 16. After pushing RBP, the distance is 8 bytes more: base at 28 and exponent at 16 bytes distance.
In that moment we stored the content of RSP inside RBP, so that later changes to RSP cannot change the distance to the parameters. That's the reason why we are able to grab the base and exponent by doing mov ebx, dword[rbp+24] and mov ecx, dword[rbp+16], respectively.

When we've computed the power, we store it in the place that we reserved for it before pushing the parameters (which is 8 bytes farther than the base) doing mov [rbp+32], rax

At the end of p_pow we restore the initial state of the stack (RBP and RSP) again before returning to p_pow_caller.

Sunday, May 6, 2012

Installing Go 1 in Ubuntu

I just followed the Gustavo Niemeyer's instructions in golang-nuts
sudo add-apt-repository ppa:gophers/go
sudo apt-get update
sudo apt-get install golang-stable
To check if the installation went ok, I run the following code
package main
import "fmt"
func main() {
 fmt.Printf("Hello, Go!\n")
doing this:
$ go run hello.go
Hello, Go!
And that's all.

Adding a Haskell Brush for SyntaxHighlighter in Blogger

Lately I've written several posts about my first steps in Haskell.
The Haskell syntax in my code samples was not being highlighted because SyntaxHighlighter does not have a brush for Haskell (see syntaxes supported here):
mulTable n = [[c * b | b <- xs] | c <- [0..n]] !! n
        xs = [0..10]
Googling a bit I found out that watashi had developed a custom brush for Haskell.
To use it in blogger, I edited the HTML of my blog by adding this line right below the place where the rest of the brush scripts were being loaded:
<script src='' 
Then to highlight the syntax of a Haskell code sample, I just did:
<pre class="haskell" name="code">
mulTable n = [[c * b | b &lt;- xs] | c &lt;- [0..n]] !! n
        xs = [0..10]
And this was the result:
mulTable n = [[c * b | b <- xs] | c <- [0..n]] !! n
        xs = [0..10]

Thursday, May 3, 2012

Playing with Haskell: Multiplication Tables

After reading I'm a list comprehension in Learn You a Haskell for Great Good! book, it came to my mind that I could generate multiplication tables as an initial exercise.
After a bit of trial and error, I managed to group the lists the way I wanted.
This is my clumsy solution:
mulTable n = [[c * b | b <- xs] | c <- [0..n]] !! n
        xs = [0..10]
And this is how it's used:
*Main> :l baby.hs 
[1 of 1] Compiling Main             ( baby.hs, interpreted )
Ok, modules loaded: Main.
*Main> mulTable 0
*Main> mulTable 1
*Main> mulTable 2
*Main> mulTable 7
*Main> mulTable 10
*Main> mulTable 12
*Main> mulTable 15
*Main> mulTable 20
@remosu has a simpler and better solution:
mulTable n = [c * n | c <- [1..10]]
and he did it also in python:
multable = lambda n: [c * n for c in range(1, 11)]

Tuesday, May 1, 2012

Interesting Talk: "Stephan T. Lavavej - Standard Template Library (STL), 6"

I've resumed watching Stephan T. Lavavej's Standard Template Library series at Channel 9.
This talk is an introduction to STL algorithms:
I like watching this while I have gedit and a console open to test things.