How does subtraction on unsigned integers work in languages that don't allow invalid values?
Would it not be possible to support subtraction?
From what direction are you asking this question?
There are certainly languages that do not have unsigned integers at all; OCaml is one such. (I believe OCaml does have extension libraries with unsigned types.)
Assuming that your concern is underflow, the answer is frequently that subtractive underflow is handled in the same manner as integer overflow; exactly how varies considerably from language to language. Even strongly typed languages will sometimes declare numeric overflow to have undefined behavior; for example, they may guarantee that the result is of the same type as the operands and that usage of the result (for example) as an array index will never violate memory safety, but be perfectly happy to take 0 - 1 and yield 2^32 - 1.
I would say that the likely cases are (with examples where I know of them):
There are quite possibly other options that are not occurring to me right now.
Sure, I can make that guarantee by compromising the performance of all array accesses, applying a guard to each one. It would be nicer to guarantee the range the index can ever have.
I guess I presumed strongly typed languages promise to always have valid values for a particular type. If the "get-out" is always "well, it's invalid, so who knows", it makes strong typing a bit meh…
I've always thought there has to be some mileage in compilers doing "type computation" for expressions and generating meta-types with limited domains. E.g. if I have an integer and do:
a & 64
Compiler could create a meta-type for this expression for which the only values are 0 and 64 - and leverage that when emitting code.
That is very common in modern compilers, but it's typically done at the optimization stage, not in the type system. There are certainly many languages that would allow you to express the type "exactly 0 or exactly 64", but in those languages it is typically a distinct type from any kind of integer that would be, for example, permitted to index an array.
Sure, but synthesising a type as the RTL is generated, rather than waiting until you do a bunch of transformations on the tree, would perhaps be better. I just like the idea that a conditional block of code whose condition is "if (i >= 0 && i < 8)" should make the type of i an integer with that specific range. None of the compilers I've ever used do that.
Fortran is strongly typed but has no unsigned integers in its standard.
But the feature is regularly proposed, as in this proposal to the J3 committee (2024-February-28):
https://j3-fortran.org/doc/year/24/24-116.txt
And it has been included in GFortran 15 as an experimental feature.
And there are long discussions about how it should function if included in the standard.
Fortran IV was good. After that it became, like every other standard, a mess after the PDP-11 came out, because now machines had char, short, long, signed, and unsigned data types. Then the x86 with near and far. Nothing seems backward compatible after that.
You had REAL and INTEGER; what more do you need? Character data was unsigned. A standard needs to be unchanging and respect all earlier versions and punched-card encodings, is my view on the matter. People today seem to forget FORTRAN was designed for a 36-bit signed-magnitude machine. Unsigned would make it unable to run on an IBM 7090.
The last time I studied programming was in 1980 on an IBM 1130.
ONE WORD INTEGERS was a non-standard option.
One way to make a type that is either 0 or 64 is to use an enum:
You can manually assign values to the constants.
Pascal? Strongly typed but not inherently unsigned unless something like:
So…
var foo : 0 .. 100 ;
then
foo := 1 ;
foo := foo - 1 ; (* OK *)
foo := foo - 1 ; (* Run-time error? *)
The second subtraction ought to throw a run-time error. (Although it's been decades since I last used Pascal.)
But to do that it needs to run on a system that can handle subtraction of whatever the native type is, then do a compare…
-Gordon
Try checking the other thread on Edison; they seem to be a bit more modern than Pascal.
A piece missing here is: what does "unsigned arithmetic" mean?
In C it has a specific meaning: modulo 2^n arithmetic with integers in the range 0..(2^n)-1. If so, emulating that in a language like Pascal or Algol 60 where there are only (signed) integers is a bit tricky but certainly doable. Presumably the emulation would use signed integers whose representation matches the unsigned value in question.
If you have a language where overflow is defined to wrap modulo 2^n, the job is nearly trivial (many operations are identical for signed and unsigned arithmetic). If the language says that overflow either is an exception or undefined, it gets harder.
C is a particularly ugly language in this respect, because it defines unsigned to wrap silently while signed overflow is undefined. Many program bugs result from this design error…
(You may know this, but for those who don't…) This is specifically because C does not take a position on sign representation, and choosing a specific signed overflow/underflow behavior would privilege whichever representation's behavior was chosen.
Good point, but treating unsigned the way C does depends on a computer that does arithmetic that way, which either means a two's complement machine, or a machine that has separate unsigned arithmetic instructions.
For example, you can't (easily) do C unsigned arithmetic on a CDC 6600, or an EL-X8, because one's complement machines can't easily treat UINT_MAX and 0 differently.
I suppose you could hack it by having unsigned ints limited to the same range as positive signed ints, i.e., don't use the top bit in the word.
I have very little experience with one's complement machines, but those instruction sets I am familiar with have separate signed and unsigned arithmetic operations. I agree that if a machine did not have this, C's ordained behavior would be problematic. I'm guessing that by the time C was standardized, one's complement machines without unsigned arithmetic were simply unusual enough that no one barked? Or perhaps there was too much extant C code that depended on overflow behaviors for pointers. The fact that the PDP-11, Interdata 7/32 and 8/32, and VAX were all two's complement may have sealed this fate.
That's interesting. The two machines I mentioned both have only signed arithmetic instructions (60 and 27 bits, respectively). There are some others I don't know well (PDP-1, ARRA II, ARMAC); I don't think those have two sets of integer instructions either.
C is later than a bunch of other high-level languages (such as ALGOL), and it probably arrived around the time that one's complement machines were no longer seen much. Perhaps enough were still around to explain why C wanted to be agnostic about sign representation, but they were uncommon enough that the C rule of unsigned arithmetic (or even the notion that there is such a thing) was still tenable.
Another point may be that the PDP-11 explicitly accommodates signed and unsigned operations, not so much for the arithmetic (that, where it matters like div and mul, is signed) but in the conditional branches. And the address space of the PDP-11 is small enough that learning to use unsigned branches when operating on pointers is a lesson programmers learned early and often.
There are a bunch of other representational issues beyond the one's or two's complement game that made signed maths hairy on the older machines. Several systems, for example, had a 16-bit word with a sign bit at the top. However, the signed-maths double-word instructions were effectively 31-bit and ignored the "sign" bit of the lower word. Some of the mainframes were even stranger and had different instructions for loading signed and unsigned values, let alone manipulating them.
Then of course there is hardware that faults on overflow, which was often (IMHO rightfully) seen as the sane and proper thing to do to keep programmer errors from producing rubbish.
A lot of older C is built around the realities of hardware at the time. Things like the rules on valid pointers emerge from segmented or object-based architectures where merely creating a pointer to an invalid object is an offence in the eyes of the CPU.
There's also a lot of stuff in C that people misunderstand entirely (like pointers), because they assume that a pointer is a memory address, whereas it's actually an (object, offset) pair (or, for C functions, just a handle, which really matters if you have overlays).
I never thought about overlays with C. Unix with the PDP-11 MMU let one use the full memory of the 11 for user programs, an important feature for the time. That may have made the 11 stand out compared to the other 16-bit computers that got MMUs late in product development.
The fact that UNIX had a modern file system, rather than relying on cards and magnetic tape, let C become a more versatile language compared to Pascal, which often compiled just to p-code.