How is your knowledge of i80x86 segments? Would you like to
implement:
void function (int i) __attribute__ (( farcall (xcodeseg) )) {}

It should produce a "fcallw" assembly instruction instead of a "calll"
when you call this function, and should finish the assembly body of
the function with "lretw" or "lretw n" when compiled with "-mrtd".
"xcodeseg" has to be an unsigned or unsigned short - it could be
optimised if it is constant but should be able to handle a variable
(or a C "constant" - known at execution time). Attribute "farcall"
alone could say "fcallw $cstseg,$cstoff" with the address of
cstseg (in code segment) stored in ".reloc" segment for loader
processing.
Note: how about "push %cs ; callw" and "retf" everywhere...

In a perfect world, a warning could be generated if such a function is
not a leaf function (or calls something other than a "farcall" function)
- because then my software would crash if the function called is a
hidden memcpy or memset! But I would still like to be allowed to
call static functions in the right segment.
Are you interested? I'll test the GCC patch!

Also, how about creating an attribute to change the return code
of a function - so that the function can be called as a standard
one or as a far one: if the return address is < 0x10000, a "retl"
is done, else a "lretw" is done. Something like, after pops:
	  cmpw $0,2(%esp)
	  jne  1f
	  retl
	  1:
	  lretw
Would be useful for malloc(), free(), and should work with "-mrtd",
and with or without -fomit-frame-pointer.

Maybe the best way would be to make changes to GCC to be able to
customise the entry and exit code when calling a function - this would handle
the previous case but also stack checking (using bound in between stack
allocation and the _first_ write to the stack, not possible in GCC-3.4.1:
some stack variables are written even with void fct(void) {bound_stack();{ }} )...

Really, I would need:
void atomic_function (long long updateval) __attribute__ ((atomic)) {
*longlongfieldptr = updateval;
}
So that the calling convention includes saving the flags, disabling interrupts,
and returning with the flags restored. Note that the stack format used may be
the interrupt stack format for the current processor, i.e. on 80x86 the
function ends with iret. This should also work for inline functions.

While talking of attributes, how about a generic "clean" attribute for
functions, which tells the compiler that only the first argument is a
non-const pointer, any other pointers are pointers to constant, and the
function does not have any other side effect. For instance strcpy(), sprintf()...
Then, in this common code extract (with sprintf declared "clean"):
char dbgmsg[80];
sprintf(dbgmsg, "%d %u", str, val);
if (debug > 4)
    puts (dbgmsg);
the sprintf() is not called and dbgmsg[] is optimised away if (debug <= 4).

By the way, talking of pointers, we already have:
int *const int_const_pointer;
How about, for 16-bit pointers:
int *short int_short_pointer;
This can be used on x86 for its 16-bit pointers, but it also has a lot of uses
on 32-bit RISC processors (PowerPC...) which use two assembly
instructions to load a pointer: loading and using a short pointer is
the only atomic pointer access available then: think of the head and tail
pointers of a standard queue.
Note that PowerPC would define signed and unsigned short pointers, linked to the
difference between @h and @ha in assembler. Unsigned short pointers have to
be used with a _base_ pointer and so loaded with "load immediate shifted r1,xxx"
for the base and " ori r1,r1,yyy " for the offset; signed short pointers have
to be used with a "middle pointer" with "load immediate shifted r1,xxx" and
" load r2,yyy(r1) " (yyy is signed in this instruction).
On a 32-bit processor (sizeof(int *)=4) the short pointer (sizeof (int *short))
is 16 bits, so the pointed area is limited to 64 Kbytes. Most data
objects are smaller than 64 Kbytes, and so are most functions. You can implement
a big switch by selecting an index in an array of short pointers to functions, like:
typedef void (*short fct_t) (void);  fct_t array[10];
and the size of this array will not increase too much, and its use in assembler
will be very quick - maybe the only dereference possible is after adding
a (void *) - i.e. 32 bit base address - while keeping the type of the pointer,
and maybe that can be checked by the compiler.
On a processor with 64-bit pointers, the short pointer is a 32-bit address. Maybe
if this pointer is used alone the base address shall be zero, or maybe you
have to add (void *)0 to use it. Maybe then we shall have a
"int *short short pointer_16bits;" like the
"short short int eight_bit_int;" I proposed, without special "char" properties.
The entry of short constant values shall be:
const short a_short_const = 4321S; /* final S instead of L */
const unsigned short an_unsigned_short_const = 4321US; /* final US instead of UL */
const short short a_byte_const = 0x10SS; /* final SS instead of L */
const unsigned short short an_unsigned_byte_const = 0x10USS; /* final USS instead of UL */
It can be used in asm to set al/ax only: asm (" int 0x10 " : : "a" (0x0E20US));

If you enjoy patching GCC, how about an attribute for variables like:
"int variable_in_code_segment __attribute__ ((segment(cs)));" , valid
for cs, ds, es, fs, gs, ss; the default being ds (usual data segment).
While you are there, segment "io" could be defined to access I/O ports
transparently: "struct UART __attribute__ ((segment(io)));" - but this
one should be multiprocessor: a lot of processors have "Special
Function Registers", like the PPC (mtspr/mfspr). It should also be
able to treat the MSR of ia32 (rdmsr/wrmsr) using simple bit structures,
and reading ("rdpmc") performance counters like variables.

Then you can add on IA32 the "processor special" registers called
__segment_cs, __segment_ds, __segment_es, __segment_fs, __segment_gs,
(or which can be named with "explicit reg vars" extension 'asm ("gs");')
so that the user can do "__segment_fs = farptr >> 16;" - and that can
be optimised like any other variable: if the variable is set twice
with the same value, the second one is optimised away. Same for the
flags: "__flags.direction = 1;" to be optimised when set twice to 1,
or to be used like "if (__flags.overflow) {}".
Note that, after test,
'extern unsigned short __reg_fs __asm__ ("%fs");'
works partly (__reg_fs = __reg_cs; OK, __reg_fs = 0; not OK)
'register unsigned short __reg_fs __asm__ ("fs");'
is not accepted at all.
Note also that 'extern struct a_struct var __asm__ ("%cs:var");'
also works partly, up to the creation of a temporary pointer.

Just for completeness, I would also like:
unsigned globalvar1 : 8;	// 255 + 1 = 0
unsigned globalvar2 : 16;	// 0 - 1 = 0xFFFF
signed   globalvar3 : 64;
These reserved for later implementation of "saturating" integers:
unsigned globalvar1 : -8;	// 255 + 1 = 255
unsigned globalvar2 : -16;	// 0 - 1 = 0
int globalvar3 : -128 ... 127;	// maybe
So then:
typedef unsigned byte : 8;
typedef unsigned boolean : -1;

Note that we have unsigned long long but no unsigned short short,
which would be char but without the aliasing analysis related
to chars.... And maybe the opposite:
long char sixteen_bits_char = 'c';
long long char thirty_two_bits_char = 'c';
Note that type "short short short unsigned int" can be type "boolean"

On the same idea, you'll get short, short short and long long enums:
enum kbd_enum { kbd_unknown, kbd_dvorak, kbd_dvorak_ansi, kbd_us,
		   kbd_uk, kbd_azerty, kbd_qwertz, };
struct my_struct {
	short short enum kbd_enum kbd1; // i.e. "enum kbd_enum kbd1:8;"
	long long enum kbd_enum kbd2;	// i.e. "enum kbd_enum kbd2:64;"
	short enum kbd_enum kbd3;	// i.e. "enum kbd_enum kbd3:16;"
};
Note that you want to be able to take the address of an enum!

And what C++ wrongly calls "static const" in the C cleverness:
inline const int compile_flag = 1;
inline const struct { int fct1 : 1; } configuration = { };
so that we can have compile switches without using CPP,
just as inline functions were introduced to replace CPP macros.
It enables "if (configuration.fct1) {}" and "switch (configuration.fct1)"
and "inline const int is_a_constant = __builtin_constant_p (tmp);"

Deeper inside, to describe that software contains code to unroll:
for (inline unsigned i = 0; i < 5; i++) { code_to_unroll_five_times; }
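
The "inline" loop variable does not exist, but recent GCC releases accept a "#pragma GCC unroll N" placed just before a loop, which comes close. It is only a request to the optimiser, not a guarantee, and compilers that do not know the pragma ignore it. A small sketch, where the function name sum_five is made up for the illustration:

```c
/* Hypothetical example: sums five elements, asking GCC to unroll
   the loop five times. The result is the same whether or not the
   compiler honours the request. */
unsigned sum_five(const unsigned *v)
{
    unsigned total = 0;
#pragma GCC unroll 5
    for (unsigned i = 0; i < 5; i++)
        total += v[i];
    return total;
}
```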

And the logical:
if (!typeof(bcopy))
memset (ptr, 0, nb);
if (!typeof(dirent.d_namelen))
...;
And the intuitive:
struct {...} a, b;
if (a != b)
...;
And (initialisation of variable size array):
void fct(unsigned variable_size) { char array[variable_size] = {}; ... }

Note also the string type:
struct { char a_string[20]; } a_var = { .a_string = "abc" };
The previous line compiles, but not this one:
struct { char a_string[20]; } a_var = { .a_string = (1 == 1)? "abc" : "def" };
because the ?: operator result is logically of type "string", not (char *)

Also, what is the interpretation of:
void fct (void) { char str[] = "abc"; ... }
Is it: void fct (void) { char *str = "abc"; ... }
Or:    void fct (void) { char str[4] = "abc"; ... }
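
For what it is worth, ISO C takes the second reading: the array is sized from its initialiser, terminating NUL included, and is a writable copy of the literal. A small sketch that checks this, where the function name local_string_size is made up for the illustration:

```c
#include <stddef.h>

/* Hypothetical probe: reports how much storage the local array takes. */
size_t local_string_size(void)
{
    char str[] = "abc";   /* array of 4 chars: 'a', 'b', 'c', '\0' */
    str[0] = 'A';         /* allowed: str is a private, writable copy */
    return sizeof str;    /* size of the array, not of a pointer */
}
```

With "char *str" instead, sizeof would give the size of a pointer and the write to str[0] would modify a string literal, which is undefined behaviour.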

By the way, who has changed the unordered meaning of the C comma
in K&R to make it a sequence point in ANSI C? People have the semicolon
to put a sequence point, the comma is for other things.
int f1(void), f2(void), f3(int, int);
This is unordered in ANSI: f3 (f1(), f2());
This is ordered in ANSI and should return to "compiler dependent": f1(), f2();

Also this shall be written somewhere:
inline functions: you cannot take their address
static functions: they do not need to respect calling conventions if nobody
takes their address.

Note about inline functions: it is very good to inline functions called once, but
sometimes you have to put them out of line because of the stack space the
function needs. GCC is not good enough at overlaying the stack space (like
temporary big buffers used to read disk sectors) used by different functions
which have just been inlined - that is annoying if you do not have a very big
stack...

Note about optimisation of external functions with a variable number of
parameters: you cannot declare their parameters constant, and so all
registers have to be written back to memory before calling, and
reloaded after the call - that is really inefficient for a common
use of, for instance, printf:
  int result = fct1 (&param1, &param2);
  if (result != 0)
      printf ("Failed %d, param1.field1 = %d\n", result,  param1.field1);
  result = fct2 (&param1);
Proposed solution, add "const ..." to say all parameters are constant in:
int printf (const char *, const ...);
int sprintf (char *, const char *, const ...);

And the CPP:
#if !include ("string.h")
#include "strings.h"
#endif

BTW, how shall I write the following thing?
void fct (unsigned, int);
typeof (fct) stub (*) { asm ("code_for_stub"); }
or:
typeof (fct) stub (*) = { asm ("code_for_stub"); }

And how about reserving the word "alias" for variable declaration:
alias the->complex->structure.field myfield;
alias the->complex->structure.field *myptr;
instead of:
typeof (the->complex->structure.field) myfield =
the->complex->structure.field;
typeof (the->complex->structure.field) *myptr =
&the->complex->structure.field;

And if someone wants to implement the "farptr" type, based on
the type "complex short" but with a different method for addition.

And if someone would encode a way to describe to an asm ("" : :) that
the _content_ of a pointer is read and/or written, and not the pointer
itself being read or written. In fact it is already done, using the same
method as a function returning a big structure - but it is often refused.
Imagine INT 0x10/ax=0x4F00 returns a table describing VESA info:

extern inline unsigned  __attribute__ ((const))
_VESA_getinfo (VESA_VbeInfoBlock *VESA_info) {
unsigned short status;

asm (" int $0x10 " : "=a" (status), "=D" (*VESA_info) : "a" (0x4F00));
return (status != 0x004F);
}

That is refused because of the '"=D" (*VESA_info)' but is
accepted with '"=g" (*VESA_info)' like:

extern inline unsigned  __attribute__ ((const))
_VESA_getinfo (VESA_VbeInfoBlock *VESA_info) {
unsigned short status;

asm (" lea %1,%%edi ; int $0x10 " : "=a" (status), "=g" (*VESA_info)
					: "a" (0x4F00) : "edi");
return (status != 0x004F);
}

And also a way to say for instance that %ah has to be set, not
necessarily the complete %eax register (code space saving).

And when a C switch is written, when optimising for size,
use a (base+offset) jump (so 16 bits per 'case:') instead
of a full word, 32 bits per 'case:', the jump table size
is also important...

And if someone would like to put another optimisation in GCC, so
that (at least when optimising with -Os) consecutive read/write/compare
to incremented pointers would be optimised:
struct { int a1, a2, a3; char b1, b2, b3, b4; } src;
struct { int a1, a2; char b1, b2, b3, b4; } dst;
dst.b4 = src.b4; dst.b3 = src.b3; dst.b2 = src.b2; dst.b1 = src.b1;
result in:  movl src.b1,dst.b1
and:
dst.a1 = src.a1; dst.a2 = src.a2
result in: rep movsl ...
Then we would no more need memcpy(), memcmp() : just a for(;;) with
typed arguments. Moreover, there will be no need for functions to
search a char in a string (strchr/scasb), a short in a short array (scasw)
or a long in a long array (scasl)...
This scas[bwl] can be used to encode a switch with sparse cases.
Note for the optimisation course homework:
Compare optimisation for (when memset is standard out of line function):
void fct (char *ptr) { memset(ptr, 0, 10); ptr[3] = '\0'; }
and:
void fct (char *ptr) { int i = 10; while (i--) ptr[i] = 0; ptr[3] = '\0'; }

And if someone would remove the assumption that "rol $8,%%ax"
modifies the upper bits of %%eax, so that we can do in C:
extern inline unsigned swab32 (unsigned val) {
union { unsigned l; unsigned short s; } tmp = { val };
tmp.s = (tmp.s >> 8) | (tmp.s << 8);
tmp.l = (tmp.l >> 16) | (tmp.l << 16);
tmp.s = (tmp.s >> 8) | (tmp.s << 8);
return tmp.l;
}
and that is compiled to either a "bswap %reg" or three "rol $,%mem"
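
The union version above relies on a little-endian layout. As a point of comparison, here is a shift-and-mask sketch of the same swap that does not depend on the byte order of the host; the name swab32_shift is made up for the illustration. Later GCC releases also provide __builtin_bswap32() for this purpose.

```c
#include <stdint.h>

/* Hypothetical portable variant: reverses the four bytes of val
   using only shifts and masks, so no union or endianness assumption. */
uint32_t swab32_shift(uint32_t val)
{
    return (val >> 24)
         | ((val >> 8) & 0x0000FF00u)
         | ((val << 8) & 0x00FF0000u)
         | (val << 24);
}
```

Whether the compiler recognises this pattern and emits a single "bswap" depends on the version and the optimisation level.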

And change the syntax of attributes to have a simple way to read/set
them - at least for structures:
typedef struct {
const attribute packed;		// or: const unsigned __packed__;
const attribute aligned = 4;	// or: const unsigned __aligned__ = 4;
unsigned char R,G,B;
} color_t;		// sizeof (color_t) == 3, even with attributes
... color_t color; if (color.packed) {} if (color.aligned >= 4) {}

And shall this work (E2FS inode i_mode):
	struct {
	    unsigned short execute	: 1;
	    unsigned short write	: 1;
	    unsigned short read		: 1;
	    } __attribute__ ((packed)) other, group, user;

But the dream stopped there, I still cannot mark as clobbered in asm()
the segment registers nor %ebp; and the bugs I reported on GCC-2.95.1
are not corrected on GCC-3.3 (.align 32 of strings & structures) ...

Some checks are also needed:
GCC-3.0.4 can initialise a field in a structure using the "fieldname:"
and ".fieldname = " syntax, but not in a union? (only "fieldname:" then)
GCC-3.1+ tries to use the %dil 8 bit register on i386
(reload char in register %edi i.e. %dil by reloading %di or %edi ?).
GCC-3.1+ reuses registers written as output in asm ("": "=") without
reloading them (not noticed again in gcc-3.3).
GCC-3.1+ does not inline shit_handler() even if __attribute__((always_inline))
is declared?

It is quite difficult to know where to place __attribute__ (()) , like :
SEGMENT_EXTRA(hexval) static int FASTCALL hexval (char c) {return c - '0';}

if (sizeof (mystruct) != 45) __GCC_ERROR_MSG ("mystruct invalid size");
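
A compile-time check of this kind can be approximated today. The sketch below assumes a C11 compiler for _Static_assert and shows the older negative-array-size trick as a fallback; the 45-byte layout of struct mystruct and the macro name COMPILE_TIME_CHECK are made up for the illustration.

```c
/* Hypothetical structure whose size is expected to be exactly 45 bytes. */
struct mystruct {
    unsigned char bytes[45];
};

/* C11: the build stops with this message if the size is wrong. */
_Static_assert(sizeof(struct mystruct) == 45, "mystruct invalid size");

/* Pre-C11 fallback: the typedef names an array of size -1, which is
   a compile error, when the condition is false. */
#define COMPILE_TIME_CHECK(name, cond) typedef char name[(cond) ? 1 : -1]
COMPILE_TIME_CHECK(mystruct_size_check, sizeof(struct mystruct) == 45);
```

The fallback gives a less readable error message than the wished-for __GCC_ERROR_MSG, but it fails the build all the same.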

struct { unsigned long long addr : 48; ... };

const int array[44] = { [0] = 0, [1 ... 43] = [] - 1 };
current index                                           ^^

This macro is nice:
#define nbof(array)	(sizeof(array) / sizeof(array[0]))
how to get a warning/error when applied to a pointer?
Or how about modifying "C99 6.7.5.3" to have this function return 10:
unsigned fct (unsigned array[10])
	{ return sizeof(array) / sizeof(array[0]); }
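
For the first question, one possible sketch relies on the GCC extensions __typeof__ and __builtin_types_compatible_p: the macro below refuses to compile when its argument is a pointer rather than an array. The names NBOF_IS_POINTER, nbof_checked, table and table_entries are made up for the illustration.

```c
/* GCC extension: evaluates to 1 when "a" already has the type of a
   pointer to its own first element, i.e. when it is not an array. */
#define NBOF_IS_POINTER(a) \
    __builtin_types_compatible_p(__typeof__(a), __typeof__(&(a)[0]))

/* Same value as nbof(), but char[1 - 2 * 1] is char[-1], a compile
   error, when a pointer is passed. The extra term is always 0. */
#define nbof_checked(array) \
    (sizeof(array) / sizeof((array)[0]) \
     + 0 * sizeof(char[1 - 2 * NBOF_IS_POINTER(array)]))

static unsigned table[10];

unsigned table_entries(void)
{
    return nbof_checked(table);   /* fine: table is a real array */
}
```

Applied to the parameter of fct() above, nbof_checked(array) would be rejected, because an array parameter is adjusted to a pointer.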


This would be useful, to either use or not use some special
assembly instruction: stwbrx, sthbrx or bswap
unsigned little __attribute__ ((endian(little)));
unsigned long big __attribute__ ((endian(big)));
unsigned long long reallybig __attribute__ ((endian(half)));
Think of :
struct {
unsigned short nb1 : 4;
unsigned short nb2 : 12;
} __attribute__ ((endian(big))) ;
Compared to:
struct {
unsigned short nb2_msb : 4;
unsigned short nb1     : 4;
unsigned short nb2_lsb : 8;
} __attribute__ ((endian(little))) ;
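
Without such an attribute, the usual workaround is to go through explicit accessors. A minimal sketch for 32-bit values, independent of the byte order of the host; the names load_be32, load_le32 and store_be32 are made up for the illustration. (GCC 6 and later also document a scalar_storage_order type attribute for structures and unions, which covers part of this wish.)

```c
#include <stdint.h>

/* Hypothetical accessor: reads a 32-bit big-endian value from memory. */
uint32_t load_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16)
         | ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Hypothetical accessor: reads a 32-bit little-endian value from memory. */
uint32_t load_le32(const unsigned char *p)
{
    return ((uint32_t)p[3] << 24) | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[1] << 8)  |  (uint32_t)p[0];
}

/* Hypothetical accessor: writes v to memory in big-endian order. */
void store_be32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v >> 24);
    p[1] = (unsigned char)(v >> 16);
    p[2] = (unsigned char)(v >> 8);
    p[3] = (unsigned char)v;
}
```

Whether the compiler turns these into stwbrx or bswap is up to its optimiser, which is exactly the part the attribute would make explicit.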

I would find a use for:
void fct (void) asmprefix ("xcode_");
To mean:
void fct (void) asm ("xcode_fct");
Because doing a cut & paste and forgetting to replace the name
in quotes leads to hard-to-find bugs.
void fct (void) asm ("xcode" ## __FUNCTION__); doesn't work
Note that for my use, it is maybe possible to find a way
to prefix all functions in one segment, and their prototype,
something like "namespace" in C++ - to be checked.
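
A partial workaround with the preprocessor: __FUNCTION__ is not a string literal, so it cannot be pasted, but the # operator produces one, and adjacent string literals are concatenated inside the asm label. A sketch for GCC, where the macro name XCODE_DECL and the function answer() are made up for the illustration:

```c
/* Hypothetical macro: declares "name" with the assembler symbol
   "xcode_<name>", so the name is typed only once. */
#define XCODE_DECL(ret, name, args) ret name args __asm__("xcode_" #name)

/* Expands to: int answer (void) __asm__("xcode_" "answer"); */
XCODE_DECL(int, answer, (void));

int answer(void)
{
    return 42;
}
```

The definition picks up the label from the earlier declaration, so the object file should contain the symbol xcode_answer while C code keeps calling answer().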

To simplify the EXTRASEG_STUB() and EXTRASEG_REVERSE_STUB()
and for them to work without -fomit-frame-pointer,
I would need __builtin_return_address_address () or whatever
name to _change_ the return address...

I also came to a point where I wanted:
const struct {
char *thename;
} thevariable = {
.thename = __VARIABLE__,
};
i.e. the same way as __FUNCTION__ or __func__[] for functions.
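
Lacking __VARIABLE__, the stringification operator of the preprocessor gives a similar result if the definition goes through a macro. A sketch, where the type struct named and the macro DEFINE_NAMED are made up for the illustration:

```c
/* Hypothetical structure holding the name of the variable itself. */
struct named {
    const char *thename;
};

/* Hypothetical macro: defines "var" and stores its own name, as a
   string, in the thename field. */
#define DEFINE_NAMED(var) const struct named var = { .thename = #var }

DEFINE_NAMED(thevariable);   /* thevariable.thename is "thevariable" */
```

The drawback compared to a real __VARIABLE__ is that every such definition has to use the macro.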

How about, like typeof:
unsigned u; unsigned short us; unsigned char uc; enum {a = 0,b,c} ue;
typemax(u) == 0xFFFFFFFF,
typemax(us) == 0xFFFF
typemax(uc) == 0xFF
typemax(ue) == 2
typemin(u) == 0

It is nice to do:
struct { unsigned a, b; } var = { .a = fct(), .b = 0 };
but how about:
struct { unsigned a, b; } var = { .a = fct(), .b = .a / 2 };
another example:
const struct { short quote, quote2;} us_kbd = { .quote = 0x2827, .quote2 = .quote };

Just to continue, in the context of "ld --gc-sections", it would be nice to be able
to use the C: "void fct (void); ... { ; &fct; }" to declare not only that a pointer
to the function is kept (so non inline) but also to mark it used, so as to KEEP() the
function in the final link from C.

What about GCC declaring local symbols to describe the size of the function:
L_sizeof_##FUNCTION = . - FUNCTION
The maximum local stack space taken by this function:
L_stacksize_##FUNCTION = 0xFFFFFFFF if alloca() used
The cumulative stack space (including static function calls)
L_cumulative_stacksize_##FUNCTION = 0xFFFFFFFF if calls external functions.
They could be taken back into C with an asm("").

For binutils, my christmas wish list is smaller:
- modify objcopy to be able to use stdout as output, like "-c" for gzip:
-c --stdout      write on standard output, keep original files unchanged
- have "gas" to recognise the pseudo op " .size a_symbol_or_function " in ELF,
without the comma to mean "equal the size of the symbol", to be used as:
asm (" sizeof_function_fct = .size fct \n"); extern char sizeof_function_fct[];
(maybe the ".size fct,.-fct" will appear before or after its use in the file)
(Same for align?)
- be able to set/change the name of the default segments in assembler,
i.e. rename the output section name used with ".data", ".rodata" and ".code"
early in the assembler file - for instance by using the following:
asm (" .data = .data16 \n .rodata = .rodata16 \n .code = .code16 \n");
(note: try it right now for fun)
It is a lot easier to do that in the assembler file, because in the linker
file, the line " *(.text) " matches everything even if later you get
" realmode(.text) " in another subsection. Treating a file differently
in the Makefile just because its sections go elsewhere is difficult (i.e.
renaming the section by objcopy).
- Be able to have a small number of exceptions in "NOCROSSREFS (.text .xcode);"
for instance in my case a variable in the .text section: gujin_param.
Maybe accept a symbol defined in the linker file to still be cross-referenced?
- There seems to be a small long-standing bug with "as -al": some lines do not appear
in the .lis files.
- TODO: understand the use of STARTUP in the linker file.
understand KEEP() is for sections and EXTERN() is for symbols
		 in the linker file?
- A way to access the linker SIZEOF(), BASE(), ALIGN() of ld sections in C,
so that we no more use and define _end, _edata, _etext but something
predefined like __startof__code, __alignof__code, __baseof__rodata
automagically with whatever ELF section used in C ?
- a way to tell as that it shall tell the linker to keep a symbol
even in an "ld --gc-sections" build?


To enable automatic generation of the .h file from the .c file (i.e. generate
a prototype for all non static functions, output typedefs for their parameters
and for all non static variables) - one needs to be able to tag a typedef
to say if it is exported or not (and so shall be written in the output .h file).
Maybe "extern typedef struct {} extern_type;" ?
Or "export typedef struct {} extern_type;" to be also able to export inline fcts?

A way to find unused #defines, never used inline functions or types - for
instance to quickly isolate a function which triggers a bug in GCC and
all its dependencies - but nothing unneeded?

