How to generate and run native code dynamically?

Not sure about Linux, but this works on x86/Windows.

Update: http://codepad.org/sQoF6kR8

    #include <stdio.h>
    #include <windows.h>

    typedef unsigned char byte;

    int arg1;
    int arg2;
    int res1;

    typedef void (*pfunc)(void);

    union funcptr {
      pfunc x;
      byte* y;
    };

    int main( void ) {
      byte* buf = (byte*)VirtualAllocEx( GetCurrentProcess(), 0, 1<<16, MEM_COMMIT, PAGE_EXECUTE_READWRITE );
      if( buf==0 ) return …
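For the Linux side that the answer leaves open, the same idea can be sketched with POSIX `mmap` and `mprotect` instead of `VirtualAllocEx`. This is not the answer's code: the helper name `run_generated_code` and the six-byte routine (`mov eax, 42; ret`) are made up for illustration, and the sketch assumes an x86 or x86-64 target and a system that lets a process make anonymous memory executable.

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS under strict -std= modes */
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Hypothetical helper: builds a tiny function at run time and calls it.
   Returns 42 on success, or -1 if the target is not x86 or a call fails. */
int run_generated_code(void)
{
#if defined(__x86_64__) || defined(__i386__)
    /* x86 machine code for: mov eax, 42 ; ret */
    static const unsigned char code[] = { 0xB8, 0x2A, 0x00, 0x00, 0x00, 0xC3 };
    const size_t len = 4096;

    /* 1. Ask for writable anonymous memory. */
    void *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return -1;

    /* 2. Emit the instructions into the buffer. */
    memcpy(buf, code, sizeof code);

    /* 3. Switch the pages from writable to executable. */
    if (mprotect(buf, len, PROT_READ | PROT_EXEC) != 0) {
        munmap(buf, len);
        return -1;
    }

    /* 4. Call the buffer as a function. Converting a data pointer to a
          function pointer is not strict ISO C, but works on POSIX systems. */
    int (*fn)(void) = (int (*)(void))buf;
    int result = fn();

    munmap(buf, len);
    return result;
#else
    return -1;   /* the byte sequence above is x86-only */
#endif
}
```

Writing first and only then switching the pages to executable avoids keeping memory that is writable and executable at the same time, which some hardened systems refuse.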

Can modern x86 hardware not store a single byte to memory?

TL;DR: On every modern ISA that has byte-store instructions (including x86), they’re atomic and don’t disturb surrounding bytes. (I’m not aware of any older ISAs where byte-store instructions could “invent writes” to neighbouring bytes either.) The actual implementation mechanism (in non-x86 CPUs) is sometimes an internal RMW cycle to modify a whole word in a …
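A small experiment makes the claim concrete. In this sketch (the names `cells`, `bump_cell` and `adjacent_byte_stores_are_independent` are invented for illustration) two POSIX threads each increment their own byte of a two-byte array. If a byte store were visibly carried out as a non-atomic read-modify-write of the containing word, one thread would occasionally overwrite the other's byte with a stale value and increments would be lost. A clean run only fails to observe a problem; the actual guarantee comes from the ISA and from the C11 memory model, which treats the two bytes as separate memory locations.

```c
#include <pthread.h>

#define ITERATIONS 1000000u

/* Two adjacent bytes; volatile keeps every load and store in the loop. */
static volatile unsigned char cells[2];

/* Each thread increments only the byte it was handed. */
static void *bump_cell(void *arg)
{
    volatile unsigned char *cell = arg;
    for (unsigned i = 0; i < ITERATIONS; i++)
        *cell = (unsigned char)(*cell + 1);
    return NULL;
}

/* Returns 1 if neither byte lost an update, 0 if one did,
   and -1 if a thread could not be created. */
int adjacent_byte_stores_are_independent(void)
{
    pthread_t t0, t1;

    cells[0] = 0;
    cells[1] = 0;

    if (pthread_create(&t0, NULL, bump_cell, (void *)&cells[0]) != 0)
        return -1;
    if (pthread_create(&t1, NULL, bump_cell, (void *)&cells[1]) != 0) {
        pthread_join(t0, NULL);
        return -1;
    }
    pthread_join(t0, NULL);
    pthread_join(t1, NULL);

    /* Each byte wraps modulo 256, so both must end at ITERATIONS % 256. */
    unsigned char expected = (unsigned char)(ITERATIONS % 256u);
    return cells[0] == expected && cells[1] == expected;
}
```

Build it with something like `gcc -O2 -pthread`; the two bytes are deliberately not padded apart, so they will almost always share a word and a cache line.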

Why is the size of L1 cache smaller than that of the L2 cache in most of the processors?

L1 is very tightly coupled to the CPU core, and is accessed on every memory access (very frequent). Thus, it needs to return the data really fast (usually within one clock cycle). Latency and throughput (bandwidth) are both performance-critical for L1 data cache. (e.g. four cycle latency, and supporting two reads and one write by …

How can I compile to assembly with gcc

I suggest also using -fverbose-asm, because then the generated assembler has some generated comments which “explain” the code. For example, gcc -S -fverbose-asm -O2 foo.c would generate in foo.s (with some comments) the assembler code produced by compiling foo.c. And to understand what the GCC optimizations are doing, one could even try -fdump-tree-all (but this …
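To try the command, any small translation unit will do. The file below is a made-up `foo.c` (the function `scale_add` is purely illustrative); it needs no `main`, because `-S` stops after generating assembly and never links.

```c
/* foo.c: hypothetical input for  gcc -S -fverbose-asm -O2 foo.c  */

/* Small enough that the annotated foo.s stays easy to read. */
int scale_add(int a, int b)
{
    return a * 3 + b;
}
```

Running `gcc -S -fverbose-asm -O2 foo.c` in the same directory writes `foo.s`; the exact instructions and comments in it depend on the GCC version and the target.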
