That's about 465 padding bytes and 13 header bytes. It's not that difficult to generate padding bytes. Headers are the ones that are difficult to make by hand (although I still do it quite often for certain formats).
Qwerty.55 wrote:
That's about 465 padding bytes and 13 header bytes. It's not that difficult to generate padding bytes. Headers are the ones that are difficult to make by hand (although I still do it quite often for certain formats).

It's more than 13 bytes for the header. The ELF header alone is larger than 13 bytes (I think the minimum ELF header is 52 bytes). The header in the executable that I created also contains quite a bit of symbol information and other stuff, though much of that can be stripped out with the "strip" command. After stripping the executable, I got a 280 byte file, so there's 233 bytes of padding/headers now.

Regardless of the exact header sizes, they are difficult to make by hand, as you said. (On the other hand, TI-8x executable headers are simple and can be created by hand relatively easily, but those programs are not x86 and can't run natively on a desktop OS.)
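As a rough check on the 52-byte figure mentioned above, the fixed ELF file header can be described with Python's `struct` module. This is only a sketch: it assumes little-endian files, and the helper name `declared_header_size` is made up for illustration. The field order follows the usual `Elf32_Ehdr` / `Elf64_Ehdr` layout.

```python
import struct

# Field order: e_ident[16], e_type, e_machine, e_version, e_entry, e_phoff,
# e_shoff, e_flags, e_ehsize, e_phentsize, e_phnum, e_shentsize, e_shnum,
# e_shstrndx. "<" means little-endian with no alignment padding.
ELF32_HEADER = struct.Struct("<16sHHIIIIIHHHHHH")
ELF64_HEADER = struct.Struct("<16sHHIQQQIHHHHHH")


def declared_header_size(data):
    """Return the e_ehsize field of an ELF image (little-endian files only)."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF image")
    # Byte 4 of e_ident is EI_CLASS: 1 = 32-bit, 2 = 64-bit.
    layout = ELF32_HEADER if data[4] == 1 else ELF64_HEADER
    fields = layout.unpack_from(data)
    return fields[8]  # e_ehsize


if __name__ == "__main__":
    print("ELF32 header bytes:", ELF32_HEADER.size)
    print("ELF64 header bytes:", ELF64_HEADER.size)
```

The program headers and section headers come on top of this fixed part, which is one reason even a stripped executable stays well above the size of its code.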
Hm, apparently my documentation on the .exe format is out of date then :p
Oh, I'm doing this in Linux using the ELF executable format. Windows uses a COFF executable format, which is probably smaller than ELF. So we're talking about two different things here. :)
Ah, okay. ELF is completely different from Windows executables. Windows .exe files (COFF headers are only part of the .exe header format) are a real mess.
And, for your convenience, here is a complete layout of the .EXE file header ;)

http://support.microsoft.com/kb/65122
That's old ;)

I have the latest one, although Microsoft published it in .docx format...
ScoutDavid wrote:
Is that hex x86 or pure machine code (no language)?


There is no such thing as "pure machine code (no language)". Assembly and machine code are the same damn thing. Your desktop CPU's machine code *IS* x86 (possibly x86_64 as well - not to mention it is compatible with at least a dozen "flavors" of x86).

ScoutDavid wrote:
Then I have a question:

If I want to code a computer program using x86 in the form of hex, I'd use the compiler NASM or whatever and write it in Hex, right?

So what are .exe's? x86? I really don't get that.


You don't want to program in hex. Programming in hex is retarded. The only reason Z80 coders do it is because they want to feel like 1337 haxx0rz or some bullshit like that. If you really want to get down to the metal, program in assembly.

ScoutDavid wrote:
OK, I'll give it a try. How dangerous is it to code? Because I'm afraid of screwing up my PC :P


It isn't dangerous at all. The code will either not run (bad format, etc...) or it will crash. It is pretty much impossible to screw up your PC without running the program as admin/root, and even then you would have to really be trying very hard to do so.
Kllrnohj wrote:
ScoutDavid wrote:
Then I have a question:

If I want to code a computer program using x86 in the form of hex, I'd use the compiler NASM or whatever and write it in Hex, right?

So what are .exe's? x86? I really don't get that.


You don't want to program in hex. Programming in hex is retarded. The only reason Z80 coders do it is because they want to feel like 1337 haxx0rz or some bullshit like that. If you really want to get down to the metal, program in assembly.

Or we want to program assembly on the go without access to a proper assembler. Though, that's become less of a problem recently with the introduction of Mimas. It can be very useful to know what bytes instructions are made of, though, if only for optimization's sake (and also for self-modifying code). I sometimes code small routines in the hex editor in calcsys if I want to test something very quickly. However, I do very much agree that it can become super-incredibly hard to maintain very large programs when coding the binary yourself. And of course, self-modifying code is not a good idea on PCs anyway due to caches and the like.
calc84maniac wrote:
Or we want to program assembly on the go without access to a proper assembler. Though, that's become less of a problem recently with the introduction of Mimas.


Fair enough, but also not valid on the PC.

Quote:
It can be very useful to know what bytes instructions are made of, though, if only for optimization's sake (and also for self-modifying code). I sometimes code small routines in the hex editor in calcsys if I want to test something very quickly. However, I do very much agree that it can become super-incredibly hard to maintain very large programs when coding the binary yourself. And of course, self-modifying code is not a good idea on PCs anyway due to caches and the like.


You can't do SMC on the PC to begin with, so also not valid.
Kllrnohj wrote:
You can't do SMC on the PC to begin with, so also not valid.

That's because the old instruction might already be loaded into the pipeline, right?
Kllrnohj wrote:


There is no such thing as "pure machine code (no language)". Assembly and machine code are the same damn thing. Your desktop CPU's machine code *IS* x86 (possibly x86_64 as well - not to mention it is compatible with at least a dozen "flavors" of x86).


How many times do I have to repeat this, Kllrnohj? They only *represent* the same thing. They are no more the same than C and Assembly are the same in that they specify particular sequences of machine operations. They are different *representations* for the same thing.
Qwerty.55 wrote:
Kllrnohj wrote:


There is no such thing as "pure machine code (no language)". Assembly and machine code are the same damn thing. Your desktop CPU's machine code *IS* x86 (possibly x86_64 as well - not to mention it is compatible with at least a dozen "flavors" of x86).


How many times do I have to repeat this, Kllrnohj? They only *represent* the same thing. They are no more the same than C and Assembly are the same in that they specify particular sequences of machine operations. They are different *representations* for the same thing.

Hmm... how about this definition of assembly language? Machine code written by a human.
calc84maniac wrote:
That's because the old instruction might already be loaded into the pipeline, right?


No, it's because the OS won't let you open a file for writing that it is executing from. Most kernels will also flag the executable portion of the loaded exe as read only in memory (enforced by the CPU).

Qwerty.55 wrote:
How many times do I have to repeat this, Kllrnohj? They only *represent* the same thing. They are no more the same than C and Assembly are the same in that they specify particular sequences of machine operations. They are different *representations* for the same thing.


No, they are the same thing. You can repeat it as often as you want; you are still wrong. C and assembly are two different things. Assembly and machine code are not: there is a 1:1 mapping between human-readable assembly and machine-readable assembly (aka machine code).
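To make the mnemonic-to-byte correspondence being argued about concrete, here is a toy sketch covering only four operand-less, single-byte x86 instructions. The function names `assemble` and `disassemble` are made up for illustration. For this tiny subset the mapping really is one-to-one in both directions; in full x86 some instructions have more than one valid encoding (for example, `mov eax, ebx` can be encoded as `89 D8` or `8B C3`), so an assembler has to pick one.

```python
# Toy table: single-byte x86 instructions that take no operands.
OPCODES = {"nop": 0x90, "ret": 0xC3, "int3": 0xCC, "hlt": 0xF4}
MNEMONICS = {byte: name for name, byte in OPCODES.items()}


def assemble(lines):
    """Translate a list of mnemonics into machine-code bytes."""
    return bytes(OPCODES[line.strip().lower()] for line in lines)


def disassemble(code):
    """Translate machine-code bytes back into mnemonics."""
    return [MNEMONICS[byte] for byte in code]


if __name__ == "__main__":
    program = ["nop", "nop", "ret"]
    machine_code = assemble(program)
    print(machine_code.hex(" "))
    print(disassemble(machine_code))
```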
Kllrnohj wrote:
calc84maniac wrote:
That's because the old instruction might already be loaded into the pipeline, right?


No, it's because the OS won't let you open a file for writing that it is executing from. Most kernels will also flag the executable portion of the loaded exe as read only in memory (enforced by the CPU).


When an executable file is loaded, a copy of it is stored in RAM. If a program wanted to make permanent changes to the copy on disk, it would have to open the file for writing, which, as you noted, Windows won't let you do these days. (I'm guessing it's because Windows can page fault from the executable for relevant portions of the program's address space.)

Also, as you noted, Windows likes to mark executable code as non-writable, but it doesn't have to. If the necessary information isn't available in the EXE file, Windows can't use that protection. This means that older programs often can rewrite the executable code in their address space. (Yes, programs these days usually have separate code and data sections, but you can make an EXE file that doesn't do that.) I'm not even going to touch Data Execution Prevention.

calc84maniac noted the caching issue. Yes, if a program changes its executable code in RAM, until the changed portion is flushed from the cache, the CPU will continue to execute the old code. Which means that you can write a program that tests the size of the instruction cache. (But be wary of OS-forced context changes.) Note that most CPUs have separate code and data caches. What happens if the instruction isn't in either? The CPU loads the instruction into the data cache and changes it. But what if the CPU loads the modified instruction into the instruction cache before it's written back to main RAM? Will the old or new instruction be executed? (Answer: It depends on the caching strategy. Also, most systems have unified L2 and L3 caches, so those can be treated like main RAM here.)
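The "necessary information" in the EXE file is, as far as the PE format goes, the Characteristics field of each section header. The sketch below decodes the three memory-permission bits of that field. The flag values are the standard `IMAGE_SCN_*` constants; the helper name `section_permissions` and the sample values are only illustrative, not taken from any particular executable.

```python
# Section flag bits from the PE/COFF section header "Characteristics" field.
IMAGE_SCN_CNT_CODE = 0x00000020
IMAGE_SCN_CNT_INITIALIZED_DATA = 0x00000040
IMAGE_SCN_MEM_EXECUTE = 0x20000000
IMAGE_SCN_MEM_READ = 0x40000000
IMAGE_SCN_MEM_WRITE = 0x80000000


def section_permissions(characteristics):
    """Render the memory-permission bits as an 'rwx'-style string."""
    perms = "r" if characteristics & IMAGE_SCN_MEM_READ else "-"
    perms += "w" if characteristics & IMAGE_SCN_MEM_WRITE else "-"
    perms += "x" if characteristics & IMAGE_SCN_MEM_EXECUTE else "-"
    return perms


if __name__ == "__main__":
    # A code section flagged read + execute (the common modern case).
    print(section_permissions(0x60000020))
    # A code section that is also writable, so the program can patch itself.
    print(section_permissions(0xE0000020))
```

A section flagged with all three bits is the kind of layout that lets an older program rewrite its own code in memory.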
DrDnar wrote:
When an executable file is loaded, a copy of it is stored in RAM. If a program wanted to make permanent changes to the copy on disk, it would have to open the file for writing, which, as you noted, Windows won't let you do these days. (I'm guessing it's because Windows can page fault from the executable for relevant portions of the program's address space.)

Also, as you noted, Windows likes to mark executable code as non-writable, but it doesn't have to. If the necessary information isn't available in the EXE file, Windows can't use that protection. This means that older programs often can rewrite the executable code in their address space. (Yes, programs these days usually have separate code and data sections, but you can make an EXE file that doesn't do that.) I'm not even going to touch Data Execution Prevention.

calc84maniac noted the caching issue. Yes, if a program changes its executable code in RAM, until the changed portion is flushed from the cache, the CPU will continue to execute the old code. Which means that you can write a program that tests the size of the instruction cache. (But be wary of OS-forced context changes.) Note that most CPUs have separate code and data caches. What happens if the instruction isn't in either? The CPU loads the instruction into the data cache and changes it. But what if the CPU loads the modified instruction into the instruction cache before it's written back to main RAM? Will the old or new instruction be executed? (Answer: It depends on the caching strategy. Also, most systems have unified L2 and L3 caches, so those can be treated like main RAM here.)


Thanks for taking the time to explain all that to calc - I didn't feel like typing all that up :P

You are also heavily focused on Windows, but Linux and others are similar in this regard, except they tend not to support broken old code like Windows does, and thus SMC is even less feasible. Multi-core and shared caches also complicate things, as you indicated. However, the L2 cache is usually *NOT* shared between cores. The L3, however, is (heck, on Sandy Bridge it is shared with the on-die GPU as well).
You've got a point. Similar concepts apply to Linux and Macs. I actually learned a lot of what I know about how the CPU handles things from a book on the Linux kernel.

Also, on the topic of Windows and broken code, Microsoft actually laments the amount of trouble they go through to support old software, because if a program doesn't work on a new operating system, they get blamed. There's a list, somewhere, of all the thousands of programs Microsoft has gone out of its way to ensure that they still work.
Ignoring the issues revolving around the cache, I believe SMC is possible in Linux. You have to copy your code (at least the SMC function) to the process's data section (on the heap or wherever) and then execute the code there. You probably have to run the mprotect() system call to make it executable before jumping/calling the SMC code.
DrDnar wrote:
Also, on the topic of Windows and broken code, Microsoft actually laments the amount of trouble they go through to support old software, because if a program doesn't work on a new operating system, they get blamed. There's a list, somewhere, of all the thousands of programs Microsoft has gone out of its way to ensure that they still work.


Oh, absolutely, Microsoft's backwards compatibility is frankly incredible. But that does have a price in that old busted stays supported.

christop wrote:
Ignoring the issues revolving around the cache, I believe SMC is possible in Linux. You have to copy your code (at least the SMC function) to the process's data section (on the heap or wherever) and then execute the code there. You probably have to run the mprotect() system call to make it executable before jumping/calling the SMC code.


There is no way to actually write it back, though, which is the whole problem. Yes, you can load executable code into the data section and execute it there (heck, this is what a JIT does - let me tell you, stack traces from Chrome's V8 JS engine are useless because of it), but you can't actually write it back out. Modifying it is also going to be troublesome, slow, or bug-tastic, depending on a number of factors.
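The copy-then-execute idea discussed above can be sketched from Python with `mmap` and `ctypes`. Several assumptions apply: the byte string is x86-64 machine code, the platform is Linux, and the kernel allows a page that is writable and executable at the same time (hardened setups may refuse). Instead of calling mprotect() after the fact, this sketch simply requests all three permissions when the page is mapped. The function name `run_then_patch` is made up, and it returns `None` wherever those assumptions do not hold.

```python
import ctypes
import mmap
import platform


def run_then_patch():
    """Run a tiny generated function, patch one byte in place, run it again."""
    # Assumption: the bytes below are x86-64 code, so give up elsewhere.
    if platform.system() != "Linux" or platform.machine() not in ("x86_64", "AMD64"):
        return None

    code = bytes([0xB8, 0x01, 0x00, 0x00, 0x00,  # mov eax, 1
                  0xC3])                          # ret
    try:
        # Anonymous page that is readable, writable and executable at once.
        buf = mmap.mmap(-1, mmap.PAGESIZE,
                        prot=mmap.PROT_READ | mmap.PROT_WRITE | mmap.PROT_EXEC)
    except (OSError, AttributeError):
        return None  # a security policy may refuse writable + executable pages

    buf.write(code)
    address = ctypes.addressof(ctypes.c_char.from_buffer(buf))
    func = ctypes.CFUNCTYPE(ctypes.c_int)(address)

    before = func()
    buf[1] = 0x02  # overwrite the immediate operand: now "mov eax, 2"
    after = func()
    return before, after


if __name__ == "__main__":
    print(run_then_patch())
```

The change lives only in this process's memory; nothing is written back to any file on disk, which matches the limitation pointed out above.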
Kllrnohj wrote:
DrDnar wrote:
Also, on the topic of Windows and broken code, Microsoft actually laments the amount of trouble they go through to support old software, because if a program doesn't work on a new operating system, they get blamed. There's a list, somewhere, of all the thousands of programs Microsoft has gone out of its way to ensure that they still work.


Oh, absolutely, Microsoft's backwards compatibility is frankly incredible. But that does have a price in that old busted stays supported.


Still, it's sometimes nice when the only program available to do that obscure algorithm you're too lazy to write yourself is Windows '98--

<pause>

--Oh the irony. Explorer.exe just crashed when I looked at the compatibility modes of a program.
  