B programming language?

Computerphile:
Original Hello World in “B” Programming Language


Watching it now… Of course, the first Hello World was in BCPL :) which somewhat pre-dates B…

GET "libhdr"

LET start () = VALOF
{
  writes ("Hello, World*n")
  RESULTIS 0
}

There’s no need to write writes yourself, as it’s in the standard library; if it were written out in the code, however, it would look like:

AND writes (s) BE
{
  FOR i = 1 TO s%0 DO
    wrch (s%i)
}

The first byte in a string (s%0) is its length.
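For anyone more at home in C, here is a rough sketch of the same length-prefixed layout. The name `lp_to_cstr` is made up for illustration and this is not how any BCPL library is actually implemented; it only mirrors the `FOR i = 1 TO s%0` loop above and assumes a one-byte length, so at most 255 characters.

```c
#include <assert.h>
#include <stddef.h>

/* Copy a length-prefixed string (byte 0 = length) into out and
   NUL-terminate it. out must have room for s[0] + 1 bytes.
   Returns the number of characters copied. */
size_t lp_to_cstr(const unsigned char *s, char *out)
{
    assert(s != NULL && out != NULL);
    size_t len = s[0];                  /* the s%0 of the BCPL version */
    for (size_t i = 1; i <= len; i++)   /* mirrors FOR i = 1 TO s%0 */
        out[i - 1] = (char)s[i];
    out[len] = '\0';
    return len;
}
```

Calling it on the bytes `{ 5, 'H', 'e', 'l', 'l', 'o' }` fills `out` with the C string `Hello`. Note that nothing after the fifth character is ever looked at, so no terminator is needed in the source.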

I’ve been doing far too much BCPL programming in recent, er, years…

-Gordon


What would C and B be like if the PDP-11 had used the carry flag to select the odd/even byte of a word, and used word addressing as was normal for the day?

Had C followed that way, we would have strings, short strings, and long strings. And far fewer security problems.

I think string handling is always going to have issues unless the language itself manages strings independently of what the user does. The language then does the string manipulation for you - usually at the cost of slower speed and added size.

BCPL (and B) is effectively typeless - a convention was adopted to make the first byte of a string its length, but that could just as well have been the first 16-bit value, or 32-bit value - but where do you declare that string’s ‘type’? It may be argued that COBOL has it right, but again, at what cost…

As an aside, it’s interesting to note that the Linux kernel has recently eliminated all instances of the strncpy() function as it’s been identified as a source of issues…
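A minimal sketch of the classic strncpy() trap, for those who haven't been bitten by it: when the source is as long as the buffer or longer, no terminating NUL is written. Both helper names below are hypothetical; `bounded_copy` is only loosely in the spirit of the always-terminate, report-truncation style of replacement, not the kernel's actual code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if strncpy left a NUL somewhere in dst[0..n-1], else 0. */
int strncpy_terminates(char *dst, size_t n, const char *src)
{
    strncpy(dst, src, n);               /* writes no NUL if strlen(src) >= n */
    return memchr(dst, '\0', n) != NULL;
}

/* Hypothetical bounded copy: always NUL-terminates dst.
   Returns the number of characters copied, or -1 if src was truncated. */
long bounded_copy(char *dst, size_t n, const char *src)
{
    assert(n > 0);
    size_t len = strlen(src);
    size_t k = len < n ? len : n - 1;   /* leave room for the NUL */
    memcpy(dst, src, k);
    dst[k] = '\0';
    return len < n ? (long)k : -1;
}
```

With a 4-byte buffer, copying "abc" is fine, but copying "abcdef" through strncpy() leaves four characters and no terminator, which is exactly the kind of thing that later reads off the end of the buffer.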

-Gordon


A byte count or null byte just marks a simple string end. A true string data type also needs the character array memory size.

Very true, both allocation/capacity and used count are needed.
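To make that concrete, here is a rough C sketch of a string that carries both its capacity and its used count. The `Str` type and `str_append` are invented for this post, and the sketch assumes the caller supplies the storage.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct {
    size_t cap;   /* bytes allocated for data */
    size_t len;   /* bytes currently in use */
    char  *data;  /* storage; no terminator is required */
} Str;

/* Append n bytes from src. Refuses (returns 0) rather than overflow;
   returns 1 on success. */
int str_append(Str *s, const char *src, size_t n)
{
    assert(s->len <= s->cap);           /* invariant of the type */
    if (n > s->cap - s->len)
        return 0;                       /* would not fit: leave s untouched */
    memcpy(s->data + s->len, src, n);
    s->len += n;
    return 1;
}
```

Because the capacity travels with the string, the append can check the bound itself instead of trusting the caller to have done the arithmetic.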

Today one might also need a character code type. ASCII never seemed to be a fixed standard; there was a new subset every few years.

That way lies madness: there are dozens, if not hundreds, of legacy character sets, some with wild semantics (take the dozens of national EBCDICs, or the Asian ones, especially the CJK). For a modern design, I would go for ASCII (eighth bit allowed but ignored), and UTF-8, that’s it. Everything else is legacy, or app-internal.


B not necessarily. The Honeywell L66 B, I am fairly sure, used C-style strings. It was however a lot more advanced (and nicer) than the original B on the PDP-11 and had a lot of really cool features, including some that C missed out on, like ranges in switch() statements and all functions being vararg with the number of args known. That meant you had functions like

concat(buffer, string1, string2, …);
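C can only approximate this, because a C vararg function does not know how many arguments it was given. A rough sketch, with the count passed explicitly (the name `concat_n`, the `cap` parameter and the return convention are all my own assumptions, not the L66 B library):

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <string.h>

/* Concatenate nargs C strings into buf (capacity cap bytes).
   Returns the resulting length, or (size_t)-1 if it would not fit,
   in which case buf holds the strings that did fit. */
size_t concat_n(char *buf, size_t cap, int nargs, ...)
{
    assert(cap > 0);
    va_list ap;
    size_t used = 0;
    va_start(ap, nargs);
    for (int i = 0; i < nargs; i++) {
        const char *s = va_arg(ap, const char *);
        size_t n = strlen(s);
        if (n >= cap - used) {          /* must leave room for the NUL */
            va_end(ap);
            buf[used] = '\0';
            return (size_t)-1;
        }
        memcpy(buf + used, s, n);
        used += n;
    }
    va_end(ap);
    buf[used] = '\0';
    return used;
}
```

So `concat_n(buf, sizeof buf, 3, "Hello", ", ", "World")` builds "Hello, World". Get the 3 wrong and you are into undefined behaviour, which is precisely what a language-supplied argument count avoids.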

It also had stdio, or something very close to it, and an optional preprocessor. One peculiar quirk was that there was no sprintf; instead you could open a string like a file, which is actually much more flexible but more annoying to use.

For the curious

and there are other tools documented there too, including BOFF (the B Obscure Feature Finder), i.e. a debugger.


Here is a reference manual: B Programming Language Manuals : University of Waterloo : Free Download, Borrow, and Streaming : Internet Archive


Thanks Lars - I think (edit: but I’m wrong - see below) there’s a plain HTML version of that, or something very like it, following links from Alan’s link. See The B Tutorial Guide in Index for “expl b”

That tutorial is something else, and it’s based on Kernighan’s tutorial. HTML btut.ms and scanned A Tutorial Introduction To The Language B : Brian W. Kernighan : Free Download, Borrow, and Streaming : Internet Archive


I can see the resemblance to C. I was expecting it to look more like BCPL. I’d read Thompson talk about it as a “threaded interpreter,” like Angelo did. Hearing that term gets one thinking of spawning off processes, but that wasn’t what happened.

It was interesting to hear Angelo reference Forth in that connection, because I’ve looked at how Forth operates. While I can see why the Forth literature uses the terms “compile” and “interpret,” it feels like an exaggeration: all the Forth interpreter does is follow a trail of addresses down to primitives, in depth-first order, and execute the machine code inside each primitive, which was previously produced by an assembler. So you could say the interpreter is more or less traversing a parse tree that’s been incrementally produced by the compiler. The compiler simply translates a word to its code-field address, to link into a “trail” (since each word’s executable code is largely made up of these addresses) that the interpreter will follow. There is no decoding necessary (the compiler has already done it), because the addresses are plugged right into a JSR machine instruction.

Looking this up, it looks like B worked the same way, except by Angelo’s description, it probably didn’t use an incremental compiler.


There is a compiler for B: GitHub - sergev/blang: Compiler for the B language


There’s a whole book on this concept: Threaded Interpretive Languages (as mentioned in the references of the Wikipedia article)

Nice collection of supporting docs in that repo, thanks @Serge_V!

From S C Johnson’s reference

B works tolerably well on the H6070, which is a word addressable machine; when using a byte addressable machine such as the IBM 360/370 models or the PDP-11, B seems less attractive. A successor language, C, is being developed which allows most of the advantages of B on byte addressable machines, as well as a structure capability. While the case for C on the H6070 is not as strong as it is on byte addressable machines, the structure and character manipulation capabilities make it likely that C will eventually appear on the H6070.
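To illustrate why character handling is awkward on a word-addressable machine, here is a rough C sketch of getting at individual characters packed into words, in the spirit of what I recall as B's char() and lchar() library functions. The layout is an assumption made for illustration (four 8-bit characters per 32-bit word, first character in the high byte), not the H6070's real word format, and the function names are made up.

```c
#include <assert.h>
#include <stdint.h>

enum { CHARS_PER_WORD = 4 };   /* assumed: four 8-bit chars per 32-bit word */

/* Bit offset of character i within its word (char 0 is the high byte). */
static unsigned char_shift(unsigned i)
{
    return 8u * (CHARS_PER_WORD - 1 - i % CHARS_PER_WORD);
}

/* Fetch character i from a word-packed string. */
unsigned get_char(const uint32_t *s, unsigned i)
{
    return (s[i / CHARS_PER_WORD] >> char_shift(i)) & 0xFFu;
}

/* Store character c at position i, leaving its neighbours untouched. */
void put_char(uint32_t *s, unsigned i, unsigned c)
{
    assert(c <= 0xFFu);
    uint32_t *w = &s[i / CHARS_PER_WORD];
    unsigned shift = char_shift(i);
    *w = (*w & ~((uint32_t)0xFFu << shift)) | ((uint32_t)c << shift);
}
```

Every character access turns into a divide, a shift and a mask on a whole word, which is the overhead byte addressing makes disappear.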


Much of Forth terminology is slightly different from mainstream usage, but I don’t think the use of “compile” here is unfair. The Forth compiler is translating from a higher-level form to a lower-level form. Forth people are aware that threaded code isn’t running natively; there is the term “inner interpreter” for executing threaded code, as opposed to the “outer interpreter” or “text interpreter” that is processing the source code.

Essentially, yes. The simplicity of this process is considered a feature, not a bug.


… because the addresses are plugged right into a JSR machine instruction.

Correction: Not JSR, but a LOAD instruction. The addresses are used for reference to the code fields of other words, which refer to the parameter fields for them, which contain more addresses, which are then used as reference for a LOAD instruction in an iterative process, etc. When the interpreter reaches a primitive, it JUMPs to the first native machine instruction (and the machine code JUMPs back into the Forth runtime when it’s done, where the process of going down trails of code-field addresses continues).
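A toy model of that process in C may help, with heavy caveats: it is a sketch of the idea only, not any real Forth. Each word has a code field (a function pointer here) and, for colon words, a parameter field holding the addresses of other words. Primitives are C functions standing in for machine code, and C's own call stack stands in for the Forth return stack. All names are invented.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Word Word;
struct Word {
    void (*code)(const Word *self);   /* code field */
    const Word *const *body;          /* parameter field: NULL-terminated
                                         list of word addresses */
};

static long stack[16];
static int  sp;

static void push(long v) { assert(sp < 16); stack[sp++] = v; }

/* Primitives: where the "native code" actually runs. */
static void prim_two(const Word *self) { (void)self; push(2); }
static void prim_dup(const Word *self) { (void)self; push(stack[sp - 1]); }
static void prim_mul(const Word *self) { (void)self; sp--; stack[sp - 1] *= stack[sp]; }

/* Code for colon words: load each address from the parameter field and
   enter that word through its code field, depth-first. */
static void docol(const Word *self)
{
    for (const Word *const *ip = self->body; *ip != NULL; ip++)
        (*ip)->code(*ip);
}

static const Word TWO = { prim_two, NULL };
static const Word DUP = { prim_dup, NULL };
static const Word MUL = { prim_mul, NULL };

/* : SQUARE DUP * ; */
static const Word *const square_body[] = { &DUP, &MUL, NULL };
static const Word SQUARE = { docol, square_body };

/* : DEMO 2 SQUARE SQUARE ; */
static const Word *const demo_body[] = { &TWO, &SQUARE, &SQUARE, NULL };
static const Word DEMO = { docol, demo_body };

long run_demo(void)
{
    sp = 0;
    DEMO.code(&DEMO);
    return stack[sp - 1];
}
```

`run_demo()` leaves 16 on the stack: 2, squared, squared again. The "compiled" form of SQUARE is nothing more than the array of addresses in `square_body`, which is the trail the interpreter follows.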


For threaded code, ITC or DTC can be used. That’s indirect threaded code or direct threaded code. See “Moving Forth: Part 1”.

There’s also subroutine threaded code. Here the inner interpreter is dispensed with, and instead each address is replaced with a subroutine call. Sometimes some primitives are inlined.
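In C terms, subroutine threading looks roughly like the sketch below: each word becomes an ordinary function and a colon definition is just a run of calls, so the CPU's call/return mechanism does the job of the inner interpreter. The names are made up, and a real Forth would of course emit machine-level call instructions rather than C.

```c
#include <assert.h>

static long st[16];
static int  top;

static void push(long v) { assert(top < 16); st[top++] = v; }
static void dup_(void)   { assert(top > 0); push(st[top - 1]); }
static void mul_(void)   { assert(top > 1); top--; st[top - 1] *= st[top]; }

/* : SQUARE DUP * ;       compiled as two plain calls */
static void square(void) { dup_(); mul_(); }

/* : CUBE DUP SQUARE * ;  calls nest like any other subroutine */
static void cube(void)   { dup_(); square(); mul_(); }

long stc_demo(long x)
{
    top = 0;
    push(x);
    cube();
    return st[top - 1];
}
```

`stc_demo(3)` gives 27. Inlining a primitive here would just mean pasting the body of `dup_` or `mul_` in place of the call.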
