Today I started learning x86 Assembly. I am using NASM and have a question about this code:


Code:
section .data
   hello:     db 'Hello world!',10    ; 'Hello world!' plus a linefeed character
   helloLen:  equ $-hello             ; Length of the 'Hello world!' string
                                      ; (I'll explain soon)

section .text
   global _start

_start:
   mov eax,4            ; The system call for write (sys_write)
   mov ebx,1            ; File descriptor 1 - standard output
   mov ecx,hello        ; Put the offset of hello in ecx
   mov edx,helloLen     ; helloLen is a constant, so we don't need to say
                        ; mov edx,[helloLen] to get its actual value
   int 80h              ; Call the kernel

   mov eax,1            ; The system call for exit (sys_exit)
   mov ebx,0            ; Exit with return code of 0 (no error)
   int 80h


On the second line:


Code:
hello:     db 'Hello world!',10    ; 'Hello world!' plus a linefeed character


I'm wondering why it doesn't need a 0 byte at the end like strings do in Z80 Assembly. In fact, don't all strings need a 00 byte at the end to mark where they stop? Why is it not there? Thanks.

Also, I don't think many people can help me here, but it's worth a shot, right?
I don't know any x86 asm, but I guess that since helloLen is the length of the string, there's no need to indicate where the string ends...
JosJuice wrote:
I don't know any x86 asm, but I guess that since helloLen is the length of the string, there's no need to indicate where the string ends...


Thanks, that makes sense. :)
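
To make the length-versus-terminator point concrete, here is a minimal C sketch of the same idea. It assumes Linux or another POSIX system, and the names greeting and print_greeting are made up for illustration. Because write() is told exactly how many bytes to send, the buffer does not need a trailing 0 byte.

```c
#include <assert.h>
#include <unistd.h>

/* 13 bytes, deliberately NOT zero-terminated (a char list, not a string literal). */
static const char greeting[] = {
    'H', 'e', 'l', 'l', 'o', ' ', 'w', 'o', 'r', 'l', 'd', '!', '\n'
};

/* Hypothetical helper: sends the buffer to fd and returns write()'s result. */
ssize_t print_greeting(int fd)
{
    assert(fd >= 0);  /* caller must pass an open descriptor */
    /* sizeof greeting plays the role of helloLen (equ $-hello). */
    return write(fd, greeting, sizeof greeting);
}
```

Calling print_greeting(1) sends the 13 bytes to standard output. A routine that has no length argument (like puts() in C) is the kind that needs the terminator.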
You don't "call the kernel", by the way; I get what you mean, but it's inaccurate. It would be better to say "invoke the system call interrupt" or "perform a system call". You're alerting the kernel that you want to make a system call, but it's actually the kernel that is choosing to handle the interrupt and subsequently serve it as a system call.

Code:

int 80h              ; Call the kernel
mov eax,1            ; The system call for exit (sys_exit)


I actually haven't studied the code yet. I don't understand it, and I didn't write the comments.

Does this mean I have to invoke the System Call to perform a system call? So I need to "call the kernel" to use SysCalls?
int 80h triggers interrupt vector 0x80, just as pressing ON or having a timer expire triggers an interrupt on the z80. The x86 has many more than a single interrupt, and its numbered interrupts are used for all kinds of things, like hardware interfacing, detecting keyboard and mouse events, setting up and asynchronously completing I/O reads and writes, etc. I was saying that the phrase "call the kernel" is wrong in the comments, or at least largely inaccurate. Firstly, it's not a call, and secondly, you're not calling the whole kernel, you're making a system call interrupt, and the kernel happens to be choosing to serve it for you.
Also, you should stay *far* away from interrupts. Always use a library to interface with the kernel (C standard library + a threading library is probably all you'll need there). Always. No exceptions.

You also should try and have your strings terminate with a null character *AND* you should manually keep track of their lengths. Never assume a string ends with a null character - that's the cause of 99% of buffer overflow exploits.
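
One way to read the "terminate and also track the length" advice is sketched below in C. The helper name copy_bounded and its signature are hypothetical, not from any library. It never trusts the source to be terminated, never writes past the destination, and always leaves the destination terminated.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies at most dst_size - 1 bytes from src (whose
   length src_len is tracked by the caller) and always zero-terminates dst.
   Returns the number of bytes copied, not counting the terminator. */
size_t copy_bounded(char *dst, size_t dst_size, const char *src, size_t src_len)
{
    assert(dst != NULL && src != NULL);
    if (dst_size == 0)
        return 0;                      /* no room even for the terminator */
    size_t n = src_len < dst_size - 1 ? src_len : dst_size - 1;
    memcpy(dst, src, n);               /* length-driven, never scans for '\0' */
    dst[n] = '\0';
    return n;
}
```

With an 8-byte destination and the 12-byte input "Hello world!", this copies 7 bytes and leaves "Hello w" in the buffer instead of overflowing it.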
Again, David, "int 0x80" causes interrupt #0x80 to occur, which is a system call. You set up which system call you want and the arguments you want before you do the int 0x80. Kllrnohj, I don't think you read carefully; he's not creating an interrupt. Methinks that Kllrnohj hasn't delved deeply into Linux kernel internals. Smile
KermMartian wrote:
Again, David, "int 0x80" causes interrupt #0x80 to occur, which is a system call. You set up which system call you want and the arguments you want before you do the int 0x80. Kllrnohj, I don't think you read carefully; he's not creating an interrupt. Methinks that Kllrnohj hasn't delved deeply into Linux kernel internals. Smile


Uh... o.0

Kerm, you *might* want to re-read your own post as well as mine...
Kllrnohj: I think KermMartian thought you meant writing an interrupt handler, not using/triggering them. Writing interrupt handlers is indeed tricky and not for a newcomer, but using them (like int 80h) is not.

Either way, I think it's ok to use interrupts like this, as David is doing it to learn x86, and it's pretty well documented too.

David: I don't use x86 regularly, but I have used it a bit in the past, and I have fairly extensive experience with system calls on Linux. Let me explain what the code does:


Code:
section .data
   hello:     db 'Hello world!',10    ; 'Hello world!' plus a linefeed character
   helloLen:  equ $-hello             ; Length of the 'Hello world!' string
                                      ; (I'll explain soon)

This defines a string "Hello world!" plus newline without a zero byte. It also defines a symbol, helloLen, equal to the length of the string (the name showed up as 0x5 when I quoted the code, which I assume was the forum's word filter at work). This should be pretty straightforward since you seem familiar with Z80 asm.


Code:
section .text
   global _start

_start:

This switches to the ".text" section. "Text" basically means "executable code". This also defines and exports the symbol _start. By default the linker uses _start as the entry point, so all programs start there (even C programs, where the C runtime provides _start).


Code:
   mov eax,4            ; The system call for write (sys_write)
   mov ebx,1            ; File descriptor 1 - standard output
   mov ecx,hello        ; Put the offset of hello in ecx
   mov edx,helloLen     ; helloLen is a constant, so we don't need to say
                        ; mov edx,[helloLen] to get its actual value
   int 80h              ; Call the kernel

These lines call the "write" system call. Here is the C prototype for the write system call:


Code:
ssize_t write(int fd, const void *buf, size_t count);

(More details here).

Linux passes arguments to system calls via registers. The first argument (fd) is in ebx, the second (buf) is in ecx, and the third (count) is in edx. eax holds the system call number, which is 4 for write on 32-bit x86.
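
The "number plus arguments" shape of a system call can also be seen from C, as a rough sketch. This assumes Linux with glibc, whose syscall() function takes the system call number followed by its arguments and does the register setup and the trap for you. The wrapper name raw_write is made up for illustration.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE           /* makes glibc declare syscall() */
#endif
#include <assert.h>
#include <sys/syscall.h>      /* SYS_write */
#include <unistd.h>           /* syscall() */

/* Hypothetical wrapper: performs the write system call by number.
   SYS_write is 4 on 32-bit x86 (and 1 on x86-64); the header supplies the
   right value for the machine the code is compiled for. */
long raw_write(int fd, const void *buf, unsigned long count)
{
    assert(buf != NULL);
    return syscall(SYS_write, fd, buf, count);
}
```

raw_write(1, "Hello world!\n", 13) corresponds to the four mov instructions and the int 80h in the assembly version: the number goes where eax would, and the three arguments go where ebx, ecx, and edx would.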

The "int 80h", as mentioned already, triggers interrupt 0x80 (80 hex/128 decimal) which tells the kernel to run the system call. These 5 instructions are equivalent to the following C code:


Code:
  char hello[] = "Hello world!\n"; // this string is defined in the data section and is zero-terminated
  write(1, hello, strlen(hello)); // we don't use strlen() in the x86 code

The 1 is the file descriptor for standard output, which is already open for every regular process that you would run from the shell, and would point to the terminal (unless it's redirected to a pipe or a file).

When a syscall returns from the kernel, the eax register holds the result of the syscall (the return value). On success, the write syscall returns how many bytes it wrote (13 in this example). On error it returns a negative value. Take the negative of that negative value (to make it positive) and you'll have the error code as defined in /usr/include/errno.h (or more likely in /usr/include/asm-generic/errno-base.h or similar).
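
For comparison, here is a small C sketch of the same error convention as seen through the C library (assuming a POSIX system; the function name write_error_code is hypothetical). The library's write() wrapper hides the negative return value: it returns -1 and stores the positive error code in errno.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: returns 0 if the write succeeded, otherwise the
   positive error code (the same number as in errno.h). */
int write_error_code(int fd, const void *buf, size_t count)
{
    assert(buf != NULL);
    if (write(fd, buf, count) >= 0)
        return 0;
    return errno;   /* e.g. EBADF when fd is not an open descriptor */
}
```

Writing to file descriptor -1 yields EBADF here. In the assembly version the kernel would hand back the negated value of that same code in eax.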


Code:
   mov eax,1            ; The system call for exit (sys_exit)
   mov ebx,0            ; Exit with return code of 0 (no error)
   int 80h

This calls exit(0); which exits the program. It's an error to let the program run off the end (same as in Z80) or to return (with the "ret" instruction). You must call exit() one way or another to terminate the program, or otherwise the program will very likely crash after it runs. Fortunately for C programmers, this is done automatically, even by returning from the main() function.

The C runtime does essentially this:

Code:
_start:
   call main
   mov ebx,eax  ; main's return value is in eax
   mov eax,1
   int 80h  ; exit(status);
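
The same idea can be sketched in C, under some assumptions: a POSIX system, a made-up program_main() standing in for main(), and _exit() standing in for the exit system call (the real C runtime calls exit(), which also flushes stdio and runs atexit handlers first). The stub runs in a child process so the parent can observe the status.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for a program's main(). */
static int program_main(void)
{
    return 7;
}

/* Mimics the startup stub: call main, then hand its return value to exit. */
static void start_like_stub(void)
{
    int status = program_main();
    _exit(status);            /* never returns, like int 80h with eax = 1 */
}

/* Runs the stub in a child process and returns the exit status the parent
   observes, or -1 if something went wrong. */
int observed_exit_status(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        start_like_stub();    /* child: does not come back */
    int wstatus;
    if (waitpid(pid, &wstatus, 0) < 0)
        return -1;
    return WIFEXITED(wstatus) ? WEXITSTATUS(wstatus) : -1;
}
```

The parent sees 7, the value program_main() returned, which is the same value a shell would report in $? for a real program.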


Hope this helps. :)
Kllrnohj wrote:
KermMartian wrote:
Again, David, "int 0x80" causes interrupt #0x80 to occur, which is a system call. You set up which system call you want and the arguments you want before you do the int 0x80. Kllrnohj, I don't think you read carefully; he's not creating an interrupt. Methinks that Kllrnohj hasn't delved deeply into Linux kernel internals. Smile


Uh... o.0

Kerm, you *might* want to re-read your own post as well as mine...
create == trigger? Fascinating.
KermMartian wrote:
create == trigger? Fascinating.


First, I never used the word "create" or "trigger", so I still have no clue where you got the idea that I suggested he should write an interrupt handler. Seriously, go actually read my post, for reals this time.

Second, creating an interrupt is the same as triggering an interrupt, although "creating an interrupt" is just a rather poor way of saying it. Creating an interrupt *HANDLER* is something totally different. Regardless, according to Intel the correct wording would be "call an interrupt", not trigger, cause, or create.
Kllrnohj wrote:
KermMartian wrote:
create == trigger? Fascinating.


First, I never used the word "create" or "trigger", so I still have no clue where you got the idea that I suggested he should write an interrupt handler. Seriously, go actually read my post, for reals this time.

Second, creating an interrupt is the same as triggering an interrupt, although "creating an interrupt" is just a rather poor way of saying it. Creating an interrupt *HANDLER* is something totally different. Regardless, according to Intel the correct wording would be "call an interrupt", not trigger, cause, or create.

How about "generating" an interrupt? That seems to be a fairly common way to word it which isn't ambiguous.
christop wrote:
How about "generating" an interrupt? That seems to be a fairly common way to word it which isn't ambiguous.


How about "call" - the word used in the documentation from the people who invented the thing? Int calls an interrupt. Yup, that seems like a good idea, let's go with "call". Razz
Just watch as the drama unfolds and the denials get neck high as the two senior programmers battle it out over interrupts! Next week, only on Bravo.

But more seriously, cool to see you learning some x86 David Smile. Any ideas on how you'll apply it to a project? Or perhaps you're learning it just to grow in programming skill (both are good)?
Kllrnohj wrote:
Second, creating an interrupt is the same as triggering an interrupt, although "creating an interrupt" is just a rather poor way of saying it. Creating an interrupt *HANDLER* is something totally different. Regardless, according to Intel the correct wording would be "call an interrupt", not trigger, cause, or create.
Arguing semantics? You know what I meant, you're just trying to backtrack past the fact that you didn't read carefully or didn't know what you were talking about, but heaven forbid you ever admit you might actually have been wrong about something. Cool

The fact that you said:
Quote:
Also, you should stay *far* away from interrupts. Always use a library to interface with the kernel (C standard library + a threading library is probably all you'll need there). Always. No exceptions.
made it clear you thought he was trying to write an interrupt for the purposes of some kind of threading, since you singled out threading instead of stdlib, math, or anything else.
KermMartian wrote:
Arguing semantics? You know what I meant, you're just trying to backtrack past the fact that you didn't read carefully or didn't know what you were talking about, but heaven forbid you ever admit you might actually have been wrong about something. Cool


I will admit I'm wrong when I am actually wrong. I'm not backtracking at all, and unlike you I did read carefully.

Quote:
The fact that you said:
Quote:
Also, you should stay *far* away from interrupts. Always use a library to interface with the kernel (C standard library + a threading library is probably all you'll need there). Always. No exceptions.
made it clear you thought he was trying to write an interrupt for the purposes of some kind of threading, since you singled out threading instead of stdlib, math, or anything else.


Kerm, are you trolling me? Pretty sure you're trolling.

I never thought he was trying to write an interrupt handler nor was I suggesting he should - that's just stupid. Even with threading you don't write interrupt handlers. It isn't even possible for user land to write interrupt handlers, as to register them with the IDT requires you to be in ring 0.

/me thinks Kerm needs to go brush up on his x86 systems architecture. Rolling Eyes

@OP: If you are serious about learning/writing x86, definitely download (and read) Intel's programming guides and instruction set references: http://www.intel.com/products/processor/manuals/ (Kerm, clearly you need to do this, too Razz )
Christopher, thanks a lot and all of you others too. I wanted to have some contact with x86 but this looks quite complex.

I tried to print two strings with the following code:


Code:
section .data
   hello:     db 'Hello World!',10    ; 'Hello world!' plus a linefeed character
   helloLen:  equ $-hello             ; Length of the 'Hello world!' string
   
   bye:       db 'Goodbye!',10         ; 'Goodbye' plus a linefeed character
   byeLen:    equ $-bye               ; Length of 'Goodbye' string
   
section .text
   global _start

_start:
   mov eax,4            ; The system call for write (sys_write)
   mov ebx,1            ; File descriptor 1 - standard output
   
   mov ecx,hello        ; Put the offset of hello in ecx
   mov edx,helloLen     ; Get length of string
   
    int 80h
    mov eax,1
   
    mov eax,4
    mov ebx,1
   
    mov ecx,bye
    mov edx,byeLen
   
   int 80h              ; Prepare System Calls
   mov eax,1            ; The system call for exit (sys_exit)
   
   mov ebx,0            ; Exit with return code of 0 (no error)
   int 80h


And it worked.

However, I am a bit dubious about how well-written this code is, especially because the following one does not work:


Code:
section .data
   hello:     db 'Hello World!',10    ; 'Hello world!' plus a linefeed character
   helloLen:  equ $-hello             ; Length of the 'Hello world!' string
   
   bye:       db 'Goodbye!',10         ; 'Goodbye' plus a linefeed character
   byeLen:    equ $-bye               ; Length of 'Goodbye' string
   
section .text
   global _start

_start:
   mov eax,4            ; The system call for write (sys_write)
   mov ebx,1            ; File descriptor 1 - standard output
   
   mov ecx,hello        ; Put the offset of hello in ecx
   mov edx,helloLen     ; Get length of string

    mov eax,1
   
    mov eax,4
    mov ebx,1
   
    mov ecx,bye
    mov edx,byeLen
   
   int 80h              ; Prepare System Calls
   mov eax,1            ; The system call for exit (sys_exit)
   
   mov ebx,0            ; Exit with return code of 0 (no error)
   int 80h


Thanks :)
You mean the second one didn't print "Hello World!" like the first one? That should be expected, since removing the "int 80h" on the fifth line of _start means the code never made the system call to print it.
Qwerty.55 wrote:
You mean the second one didn't print "Hello World!" like the first one? That should be expected, since removing the "int 80h" on the fifth line of _start means the code never made the system call to print it.


Indeed, I only get the "Goodbye!". I understand why, too: I never make the system call for the first string.

I can't understand, though, why I have to call it each time I need to use a syscall:


Code:
section .data
   hello:     db 'Hello World!',10    ; 'Hello world!' plus a linefeed character
   helloLen:  equ $-hello             ; Length of the 'Hello world!' string
   
   bye:       db 'Goodbye!',10         ; 'Goodbye' plus a linefeed character
   byeLen:    equ $-bye               ; Length of 'Goodbye' string
   
section .text
   global _start

_start:
   mov eax,4            ; The system call for write (sys_write)
   mov ebx,1            ; File descriptor 1 - standard output
   
   mov ecx,hello        ; Put the offset of hello in ecx
   mov edx,helloLen     ; Get length of string
   
    int 80h              ; Prepare System Calls
    mov eax,1
   
    mov eax,4
    mov ebx,1
   
    mov ecx,bye
    mov edx,byeLen
   
   mov eax,1            ; The system call for exit (sys_exit)
   
   mov ebx,0            ; Exit with return code of 0 (no error)
   int 80h


My problem here is, what is int 80h? I'm having trouble with the theory behind it.

I thought it "prepared" the system calls, but it also seems to print stuff, right?

Also, what about registers edx and ecx, do they work like the arguments for write() in C? And after setting the "arguments" I need to call a syscall?

I'm a bit confused, but I think I can overcome this. Thanks
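
As a rough C analogy for the questions above (assuming Linux with glibc; the function name say_both is made up): loading the registers is like filling in an argument list, and int 80h is the call itself. Nothing happens until the call is made, and each call performs exactly one system call, so two strings need two calls.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE           /* makes glibc declare syscall() */
#endif
#include <assert.h>
#include <sys/syscall.h>      /* SYS_write */
#include <unistd.h>           /* syscall() */

/* Hypothetical helper: prints both strings to fd, one system call each.
   Returns the total number of bytes written, or a negative value on error. */
long say_both(int fd)
{
    assert(fd >= 0);

    /* First "mov eax/ebx/ecx/edx ... int 80h" group. */
    long first = syscall(SYS_write, fd, "Hello World!\n", 13L);
    if (first < 0)
        return first;

    /* Second group: new arguments, and a second trap into the kernel. */
    long second = syscall(SYS_write, fd, "Goodbye!\n", 9L);
    if (second < 0)
        return second;

    return first + second;
}
```

Deleting the first syscall() line here has the same effect as deleting the first int 80h in the assembly: the arguments for "Hello World!" were prepared but never handed to the kernel.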
  