PARSING question white space

I want to include white space as part of the symbol. Do any parsing routines handle that?
bool( if a cat, if a dog, … )

I recall Edinburgh Imp managed to have variables, etc. that could include spaces but checking the manual: “Except when used to terminate keywords or when between quotes, spaces are ignored by the compiler and may be used to improve readability”

I did make an attempt at allowing spaces in variable names when I wrote my BASIC interpreter, by simply removing all spaces in the input line before parsing it, but it made other parts of the parser harder. I’ve been using a sortOfCamelCase method for a very long time now, so I didn’t really mind.

-Gordon

For now I plan to use character stropping, with a leading full stop indicating a keyword.
Spaces inside a symbol are part of the name, with each run of spaces trimmed to a single space.
I, I am, I am fish are all different variables.
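
A rough sketch of the scheme described above, in Python. The token rules here are assumptions made for illustration: a keyword is a full stop followed by letters, a name is a letter followed by letters, digits and spaces, and anything else is a single punctuation character. Numbers and string literals are not handled, and the function names are made up.

```python
import re

# Assumed token rules (illustrative only, not taken from any real compiler):
#   .letters               -> stropped keyword
#   letter[letters/digits/spaces] -> name, spaces included
#   any other non-space    -> one-character punctuation token
TOKEN = re.compile(
    r"\.(?P<kw>[A-Za-z]+)|(?P<name>[A-Za-z][A-Za-z0-9 ]*)|(?P<punct>\S)"
)


def normalize_name(raw: str) -> str:
    """Trim both ends and collapse each run of spaces to a single space."""
    return re.sub(r" +", " ", raw.strip(" "))


def tokenize(line: str) -> list:
    """Split a line into (kind, text) pairs using the assumed rules above."""
    tokens = []
    for m in TOKEN.finditer(line):
        if m.group("kw"):
            tokens.append(("keyword", m.group("kw")))
        elif m.group("name"):
            tokens.append(("name", normalize_name(m.group("name"))))
        else:
            tokens.append(("punct", m.group("punct")))
    return tokens
```

With these rules, `tokenize(".if I am   fish = I am .then I")` yields three distinct names, `I am fish`, `I am` and `I`, because a name only ends at a full stop or at punctuation.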

I recall that Imp77 keywords were underlined. However, as typing underlined words isn’t easy, keywords were stropped with the % symbol (I’m sure @gtoal will correct me here).

Which was also the same symbol used in the text formatter to underline words, so if you fed source code through the formatter before printing then keywords were underlined in the printout. Handy.

-Gordon

What do you need to do? As the overall functionality, that is.

I need software. The latest hardware is a computer with a 36-bit word and no character data type.
9-bit-wide characters will be packed into and unpacked from words using software routines. BCPL or B
may be the only modern software I can use. But first I need to hack a cross assembler together.
More details if I get the hardware working this month.
Ben.
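
For what it is worth, the packing and unpacking described above can be sketched in a few lines of Python. This is only an illustration: the choice to put the first character in the high-order bits and to pad short words with zero is an assumption, and the function names are made up.

```python
CHAR_BITS = 9
WORD_BITS = 36
CHARS_PER_WORD = WORD_BITS // CHAR_BITS  # four 9-bit characters per word
CHAR_MASK = (1 << CHAR_BITS) - 1         # 0o777


def pack_word(codes):
    """Pack up to four 9-bit codes into one 36-bit word.

    Assumption: the first character goes in the high-order bits and
    missing characters are padded with zero.
    """
    if len(codes) > CHARS_PER_WORD:
        raise ValueError("too many characters for one word")
    word = 0
    for i in range(CHARS_PER_WORD):
        code = codes[i] if i < len(codes) else 0
        if not 0 <= code <= CHAR_MASK:
            raise ValueError("character code does not fit in 9 bits")
        word = (word << CHAR_BITS) | code
    return word


def unpack_word(word):
    """Return the four 9-bit codes held in a 36-bit word, first character first."""
    return [
        (word >> (CHAR_BITS * (CHARS_PER_WORD - 1 - i))) & CHAR_MASK
        for i in range(CHARS_PER_WORD)
    ]
```

A cross assembler written on a modern host could use routines like these when emitting text constants for the target.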

You are quite correct, and the concept was inherited from Algol 60 - at least one implementation of which also used ‘%’ as the stropping character. As Old Ben mentions BLISS used a ‘.’. A single character before text to signify underlining on the Flexowriter was the most common stropping convention, but a character before and after (as in other ‘ALGOL’ compilers) was maybe more common; and actual underlining was used too, especially when a Flexowriter was involved.

Stropped keywords could be run together so that %endofprogramme was identical to %end %of %programme as far as the compiler was concerned, once the text had gone through line reconstruction. Atlas Autocode, and the earlier IMP compilers, also allowed switching convention within the compilation, e.g. with the keywords: %upper %case %delimiters. Or once that was enabled, LOWER CASE DELIMITERS. I can’t remember how % stropping was turned back on, if it was.

Properly underlined source code using the full set of Algol 60 symbols looks quite good, better than modern listings, although a few of the symbols still had to be approximated on the Flexowriter using overstriking (such as | and * for ↑, the up arrow). The Flexowriter used an underlined < for ≤, although it was not the same as the mathematical less-than-or-equal-to (⩽), as the two lines that represent the equals part should both be sloping in parallel, so it’s rather nice that Unicode supports the correct mathematical symbol nowadays (although not everywhere - on my system it shows up correctly in html pages but not when editing from the console command-line).
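
To illustrate why %endofprogramme and %end %of %programme can come out the same, here is a small Python sketch. It is not how any IMP or Atlas Autocode compiler actually did line reconstruction; it simply drops spaces and splits each stropped run by longest match against a keyword table. The table holds only the three keywords from the example, and the function names are made up.

```python
# Hypothetical keyword table: just the three keywords from the example above.
KEYWORDS = {"end", "of", "programme"}


def split_stropped(run, keywords=KEYWORDS):
    """Split a run of letters into keywords, longest match first.

    Greedy matching without backtracking is enough for this example but
    would not be safe for every possible keyword table.
    """
    text = run.lower()
    found = []
    pos = 0
    while pos < len(text):
        for length in range(len(text) - pos, 0, -1):
            candidate = text[pos:pos + length]
            if candidate in keywords:
                found.append(candidate)
                pos += length
                break
        else:
            raise ValueError("no keyword matches at: " + text[pos:])
    return found


def stropped_keywords(line):
    """Return the keywords in a line made up only of %-stropped keywords."""
    words = []
    for run in line.split("%"):
        run = run.replace(" ", "")  # spaces are not significant here
        if run:
            words.extend(split_stropped(run))
    return words
```

Both spellings from the example reduce to the same list, `["end", "of", "programme"]`.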

Allowing significant spaces within variable names in a language with reserved keywords and no stropping is perfectly possible as long as you expect some variable names to be rejected because they contain reserved words. I suppose you could use a hard space rather than a real space and eliminate any parsing difficulty at the expense of an editing nightmare instead. Stropping is your best bet. It is possible to write a parser that uses unstropped keywords and allows variables with the same names as keywords but there will inevitably be a case where ambiguity occurs - a statement that can be parsed two ways. Most language designs try to avoid that.

When Atlas Autocode was ported from the KDF9 to the 32-bit ICL 4/75, they tested the cross compiler on the KDF9, which had 48-bit words, merely by masking off the extra bits. You can do the same with your 36-bit target by writing on a modern Intel system with 64-bit variables and masking down to 36 bits.
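
A minimal sketch of the masking idea, in Python. Python integers are unbounded rather than 64 bits wide, but the mask plays the same role as it would on a 64-bit host. The two's complement interpretation in `to_signed` is an assumption about the target machine, and the function names are made up.

```python
WORD_BITS = 36
WORD_MASK = (1 << WORD_BITS) - 1   # 0o777777777777
SIGN_BIT = 1 << (WORD_BITS - 1)


def wrap(value):
    """Keep only the low 36 bits of a host integer."""
    return value & WORD_MASK


def add36(a, b):
    """Add two words and discard any carry out of bit 35."""
    return (a + b) & WORD_MASK


def to_signed(word):
    """Interpret a 36-bit word as a signed value.

    Assumption: the target uses two's complement arithmetic.
    """
    word &= WORD_MASK
    return word - (1 << WORD_BITS) if word & SIGN_BIT else word
```

Every arithmetic result in the emulator or cross compiler passes through the mask before it is stored, so the wider host word never leaks into the 36-bit model.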

Stropping is an easy way to get error recovery. Reserved words only need to be looked for at
the beginning of a logical line, or at a logical operation like and. There are ample graphic characters
in US ASCII. Did IMP have a way of packing symbolic names?

Hum, actually, at the moment, I can’t think of any construct where there are two adjacent identifiers. This seems to be rather rare to me. Meaning, as long as you exclude keywords either by a distinctive separator (personally, I liked Algol listings with keywords in apostrophes, like the PDP-8 implementation) or syntactically (as reserved words), it seems pretty safe to concatenate any strings into an identifier until you reach a terminal symbol described by the syntax.

(It would be problematic with certain assemblers, where lists are separated by white space, but I can’t think of any higher-level language that would rely on this exclusively.)

Edinburgh IMP77 (a variant of IMP) does allow spaces inside a variable name.
If you want to try this out, go to the GitHub repository siliconsam/imp2022 (“IMP77 compiler for Linux”) to download the source of an IMP compiler. To build and run it, look through the attached README. I’ve just released an update to the compiler suite. The compiler should be built on an Intel x86 Linux machine.
The problem with allowing spaces in variable names comes when you make a mistake and type varying numbers of spaces in repeated instances of a variable, and then want to change the variable name!

I’ve tested that.
But there are errors even with the sample files.
I sometimes get this error message (maybe an include file of an imp file is missing):

**** Arrgh! Last chance event handler triggered from line=385
**** Triggered by error (event,sub,extra)=(9,2,2)
Error message is ‘Couldn’t open file. Error Code (2)’

There are further errors, for example from imp77 -c -Fc -Fs -Fi baggins.imp:

  • 13 not declared printstring("Value = ".^itos(value,0))
    Program contains 1 fault

From testreadstring.imp:

  • 16 not declared printstring( " String Size : ".^itos(w,0))
  • 16 not declared ; spaces( w - length(^itos(w,0)))
  • 16 not declared ; printstring( ^itos(y,0) )
  • 17 not declared printstring( " Max String Length : ".^itos(w-1,0))
  • 17 not declared ; spaces( w - length(^itos(w-1,0)))
  • 17 not declared ; printstring( ^itos(y-1,0) )
  • 18 not declared printstring( " Current String Length : ".^itos(length(s24),0))
  • 18 not declared ; spaces( w - length(^itos(length(s24),0)))
  • 18 not declared ; printstring( ^itos(length(s32),0) )
    Program contains 9 faults

It’s also very difficult to follow some of the instructions, such as these for changing bilbo.s:
#1) remove the .rti compiler directives - what are these
#2) rename the mangled routines to something simple -?
#3) convert the routine entry and exit style to use: -??
#enter $0,$1

I ass-u-me that you are running on an Intel X86_64 Linux machine, with gcc-multilib installed.

The error **** Arrgh! Last chance … indicates that the IMP program expects a file name as a parameter on the command line.
There are 2 forms of IMP program
Type 1: %begin … %end
Type 2: %external %routine XXX %alias “__impmain” … %end

  1. %begin …%end %endoffile
    This then expects file names as parameters on the command line.
    e.g. myprog infile1.txt,infile2.txt=outfile1.lis,outfile2.txt
    The input files form a list (before the = sign), the output files form a list after.
    So for the example file names:
    infile1.txt is attached to inputstream(1), outfile1.lis is attached to outputstream(1) and so on.
  2. %external %routine
    This lets the programmer provide parameters on the command line as per usual programs.
    To obtain these command-line parameters see the stdperm.imp file which has lots of routine specifications (including routines to retrieve command line parameters)
    See tests/bilbo.imp as an example of using %external %routine … (and retrieving command line arguments). Build it using imp77 -Fc -Fs -Fi bilbo.imp
    As you might guess I’m a Tolkien fan.

Re your Arrgh problem, beware: imp77 -c -Fc -Fs -Fi baggins.imp will create ONLY the baggins.o ELF object file, not an ELF executable.
The -c switch tells the compiler suite to stop once the ELF object file is created by pass3elf.

To create the baggins executable which needs the code generated from bilbo.imp use:
imp77link baggins bilbo
This will compile baggins.imp and bilbo.imp and then link them to form the baggins executable.
The first parameter to imp77link represents the “program” name, the following parameters each indicate the required IMP source modules (i.e. bilbo.imp)

Or you can compile baggins.imp and bilbo.imp separately, using imp77 -c etc. on each of the two IMP files, then use the gcc command to link the two ELF files (not forgetting to reference libimp77.a).

Re bilbo.s:
The original bilbo.s was generated from bilbo.c (using -save-temps when invoking the gcc compiler on bilbo.c).
I then removed various gas directives and added the enter, leave instructions to make bilbo.s look as if the IMP compiler could generate the gas assembler source. It can’t at the moment, since the IMP compiler doesn’t reference/generate the Global Offset Table construct. That is on the list of features to add so that shareable/dynamic libraries can be generated. Currently only static libraries can be created (i.e. the IMP run-time library libimp77.a in /usr/local/lib).

Hope this explanation helps!

Sorry, this does not help.
Yes, I have an Intel X86_64 Linux machine with gcc-multilib and dos2unix.

imp77 -Fc -Fs -Fi bilbo.imp (or baggins.imp)
imp77 -c baggins.imp
etc still show several errors as shown mainly … not declared… update, see below

I now somehow have a bilbo.o and libbilbo.so (not sure if these are correct, as I don’t know how to change bilbo.s). I can’t compile baggins.imp at all.

I also don’t understand what name parameters to choose next to the imp files. I assume myprog should be an imp file.
myprog infile1.txt,infile2.txt=outfile1.lis,outfile2.txt.

Maybe the ld.i77.script is wrong. There’s one in the pass3 folder but one should create another one, but that is also unclear and didn’t work.
I’m also not sure when and how to reference libimp77.a
There are many windows files (libs) and .bat files. Do I have to replace the files with those from the linux folder?

I’ve tested other files, but maybe these are imp80.

Update: Sorry, I had the old version installed in /usr/bin and the new one as supposed in /usr/local/bin. I don’t have the not declared messages anymore.

Now I have just the Arrgh error message for
imp77 -Fc -Fs -Fi baggins.imp

So what input files are missing? Or do I have to invoke pass1-3 first?

Sorry my explanation didn’t help.
Look in /usr/local/lib to see if libimp77.a is present.
If you can’t access that folder then you will need to change the access permissions on the /usr/local folder tree to let you add, delete and execute files.
If you can access /usr/local/lib etc. but libimp77.a is not present, then download the imp2022 repository again as a zip file.
Unpack the zip file.
Then build the 3 compiler components:

  1. cd to pass3; make bootstrap
  2. cd to lib; make bootstrap
  3. cd to compiler; make bootstrap
    Check that pass1, pass2, pass3elf are in /usr/local/bin
    Check that stdperm.imp is in /usr/local/include
    Check that libimp77.a is in /usr/local/lib

Then you can cd to tests/imp2022-demo
To build baggins executable run imp77link baggins bilbo
To run baggins just run ./baggins

Let me know how it goes.
Best regards,
JD McMullin PhD

I already had successfully compiled and installed pass1-3.
In /usr/local/lib was also an old version of libimp77.a.
After fixing, I get again the not declared error messages.
Oh, I have to check for stdperm.imp…

Now I could compile it. (But have to check for other files etc).
Thanks!

I have linked it but have another issue
/usr/bin/ld: bilbo.o: in function `__impmain': multiple definition of `__impmain'; baggins.o:…
baggins.imp:(.text+0x0): first defined here
collect2: error: ld returned 1 exit status

I have deleted the files and tried again. But same error. I haven’t found impmain in the bilbo file.
I previously had the two compiled files. When running ./baggins I have an output (Baggins 1-63) and after that again an Arrgh error, but this time with

Error message is ‘Output stream ‘1’ not opened’
**** Likely cause of error: Missing/invalid command line parameter for output
file(s)

On bilbo I have this output/error
BILBO: argc=‘1’
BILBO: envc=‘63’

BILBO: Param(0)=‘./bilbo’

**** Arrgh! Last chance event handler triggered from line=17
**** Triggered by error (event,sub,extra)=(10,1,0)
Error message is ‘Incorrect command line parameter count. Argc=0’
**** Likely cause of error: Missing/invalid command line parameter(s)

Why do I have to link them anyway?

I’ve tested another sample imp77 -Fc -Fs -Fi teststring.imp

? 58 access length(longdata) = 0
*120 form print real( pi ^)
Program contains 1 fault

I must stop calling files in different folders by the same name, bilbo.imp.
I need to check your problem with tests/bilbo.imp when I get back home.
Will reply when car has new tyre fitted.

Back in my computer room.
If we are talking about imp2022/tests/baggins.imp then the program expects an output stream to be available.
So to run baggins you need to specify an output file.
./baggins =myoutput.txt
No file list provided before the = since no input stream is required.
myoutput.txt after the = specifies the file attached to the only output stream.
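
The stream convention described above (input files before the =, output files after it, numbered from 1) can be sketched in Python like this. It mirrors the description only and is not the actual IMP run-time code; the function name is made up.

```python
def parse_streams(spec):
    """Split 'in1,in2=out1,out2' into two dicts keyed by stream number.

    Files before the '=' are input streams 1, 2, ...; files after it are
    output streams 1, 2, ... Either side may be empty.
    """
    inputs_part, _, outputs_part = spec.partition("=")

    def numbered(part):
        names = part.split(",") if part else []
        return {number: name for number, name in enumerate(names, start=1)}

    return numbered(inputs_part), numbered(outputs_part)
```

For `=myoutput.txt` this gives no input streams and a single output stream 1 attached to myoutput.txt, which is the case baggins needs.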

If we are talking about bilbo.imp and baggins.imp in the imp2022/tests/imp2022-demo folder:
bilbo.imp is NOT a program but a collection of routines and so will NOT reference the __impmain ELF symbol.
However baggins.imp is meant to be a program that happens to reference the routines held in bilbo.imp.
One way to compile and build the baggins program, whilst also compiling bilbo.imp, is to run
imp77link baggins bilbo

To compile baggins.imp and bilbo.imp separately, run
imp77 -c baggins.imp
imp77 -c bilbo.imp
These 2 commands will create the ELF object files for baggins.o and bilbo.o BUT you need to link
baggins.o bilbo.o together with the IMP run-time library.
So run
gcc -m32 -no-pie -o baggins baggins.o bilbo.o /usr/local/lib/libimp77.a -lm -lc -T /usr/local/bin/ld.i77.script
You can then run the baggins executable as: ./baggins
(no input or output streams need to have an associated file)

Another observation: the scripts used to compile IMP source, which live in the /usr/local/bin folder,
are installed from the imp2022/pass3 folder when building pass3elf.
These scripts have been “updated”.
Check that your imp77 , imp77link , ld.i77.script files in /usr/local/bin/ and imp2022/pass3/ match.
(should do if you have downloaded the latest copy of imp2022 from Github and rebuilt the compiler suite.)

The stdperm.imp file does NOT need to be compiled. It is a default include file used by pass1 to indicate the routines available in the run-time library (all source in imp2022/lib)

There are extra tools created to analyse .ibj and .icd files, but they are written in FreePascal, so if you want to play with them:

  1. sudo apt install fpc (this will install FreePascal on Ubuntu or Debian)
    Assuming you also have a valid IMP compiler suite installed.
  2. In the imp2022/tools/ibj run make install to install the .ibj tools along with slimibj (written in IMP)
  3. In the imp2022/tools/icd run make install to install the .icd tools (all in FreePascal)