Does the future of “traditional” CPUs belong to RISC-V?
“Jim Keller on AI, RISC-V, Tenstorrent’s Move to Edge IP”
EE Times
Jun 9, 2023
(interview with Sally Ward-Foxton)
1:38
Keller: The surprising thing to me, on RISC-V especially, was – like, I believe
open source wins. Right? Once people switched to Linux nobody went back to NT.
Nobody went back to OS/360, right? It’s a one-way street. I think RISC-V over time
is going to win. And that’s because it’s a technology you can own, right?
You don’t have to license it. You don’t get licensing surprises. Like, you were
licensing it and all of a sudden the license fee went up. You could keep doing
something you want. . .
8:47
Keller: You asked, like, why are you being so open? Well one [answer] is: open means
that people can come talk to us and say ‘Could we do this or that?’ ‘Yeah, sure.’
As opposed to: you buy from one of the big companies. And they say, ‘Here’s
your allocation, here’s the terms, here’s the price’. And the software’s proprietary,
possibly encrypted, and comes with, let’s say, less support. And that’s a problem for
people. So that’s kind of the journey people are on. I’ll give you an example.
So years ago I was at a startup, and we used NT for servers. And we had some kind of
tools that wanted to run on NT, and there was a bug with the printer. So we reported
the bug and a year later we got an update from Microsoft that did not fix the bug.
It was well known. I don’t know why – it’s a printer driver. Like, it wasn’t a real
big problem – probably take somebody eight seconds to fix it. Didn’t get fixed.
We switched to Linux. We had a problem with a different peripheral, but
similar thing. And somebody just, you know, searched and talked to their buddies,
and we got a patch in half an hour and fixed it. So the iteration rate [went
from]: we waited a year and they didn’t fix it, to: we found a patch and we fixed it.
Like, the genius of open source is – you know, while it’s a little bumpy, you can fix it.
The software’s… You know, when I was at Tesla we built our own AI software stack.
We had total visibility to the whole thing. We did some optimizations that we could
not do with somebody else’s stuff. So one reason for open source is: you can own it;
the other is you can change it. The other is: you can actually look in the details.
So at Tenstorrent we’re going to open source our software stack for AI. . .
11:05
Keller: Another charm of open-source software – it’s not a secret mess. Like, everybody knows
how Linux works. You can go read it, right? GCC, LLVM – all the open-source software
stacks are available. And you can look at them, and some are great and some are, you
know, . . . And you can make your own calls. And then somebody might call you up and
say ‘This is a bloody mess, and here’s what you should fix.’ And I think that’s really
cool. . .
14:46
Keller: So here’s an interesting thing. So Intel was the original open-source architecture.
You know? You remember – they had seven sub-licensees. The reason the 8086 beat the Z80
and the 6502 and the 6800 and the 68000 is: there were multiple players. Now Intel did an
especially good job in the combination of foundry and architecture. For years [it] was a winner.
And in some sense having AMD as a real competitor has kept them honest in a way that, you
know. . . Like, some proprietary things without the competitive pressure, you know, lose it.
So no, I think the open-source world is important. And we’ve seen that dynamic play out.
Like, there was an explosion of minicomputer companies, right? And then when they got successful
they started ignoring the market. And then there were workstation companies. And then the PC
world, you know, basically decimated the old computer world. And so these changes have happened
multiple times, and they will continue to happen. At the top of a peak everybody thinks
‘Why would it change?’. I don’t know – history? That’s what books are for, you know?
A hundred years of technology, you know, evolution? Geez, it’s amazing. . .
16:24
Keller: Yeah, so, my belief is over the next five to ten years RISC-V will take over all the
data centers. I think that’s true.
Sally Ward-Foxton: Even supercomputers?
Keller: Oh, especially supercomputers!
Sally Ward-Foxton: Because it’s customizable?
Keller: Oh yeah. We’ve talked to multiple supercomputer guys, and they’re like: ‘We’re the world’s
experts on vectorizing compilers, vector floating-point performance, local memory.’ And then they go
to the big vendors and they say ‘I want these changes’. And the vendors all sensibly say, ‘No’.
Right? So… but they’re actually the experts. Like, they know what they want, right? So, yeah.
So supercomputing could happen faster. Now, the surprising thing about RISC-V is how fast the
software ecosystem is moving. And there’s two reasons for that: one is – I explained this to
some friends at AMD and they were somewhat pained about this, right? The data center ran
on Intel, right? A lot of that was open-source Linux, but a lot of it’s proprietary
system-management software. So when AMD showed up with a pretty good product, it didn’t work.
Because AMD didn’t have access to all of Intel’s proprietary system-management code. People actually
had to write it. And a lot of it got written in open-source and C. So when they ported to ARM,
you’d think: well, Intel to AMD is easy, right? And AMD to ARM is harder. No, it’s easier!
C programs! Compilers work! The problem wasn’t whether it was x86 or ARM underneath; the problem
was whether the software was proprietary or not. They were worried about the wrong thing. Now, ARM has its
own teething pains to become a server platform. So, we started a joint venture with a company in
India called Bodhi, who’s going to build server products. And they have really great software guys
there, and they brought up. . . So, there’s an emulator called QEMU, there’s an extension for RISC-V,
there’s a virtualization software stack which they ran, and then they put Linux on top of that
and ran multiple applications. So we’ve actually, in less than six months, brought up a server
software stack on a RISC-V emulator. I was amazed! I thought that was going to take a year. I thought
they were gonna, you know. . . And then Google of all people is porting Android to RISC-V. A couple
other big companies have internal RISC-V developments. Rivos has a great software team. SiFive has
been pushing this for years. Ventana’s a good company. They’ve got some really good engineers – some
of those guys are my friends. And so, I actually think the RISC-V change is going to accelerate. . .
[24] Jim Keller Interview 2: AI Hardware on PCIe Cards
TechTechPotato
Feb 24, 2023
(interview with Ian Cutress)
7:30
Cutress: So, [I’ve] been speaking with Wei-han, your RISC-V chief architect. Why do you have. . .
It’s a two-part question, really. Why have you got cores in your AI chip at all? And why
are they RISC-V cores?
Keller: Yeah, there’s a couple [of issues]. So why RISC-V processors at all? So first,
our Tensix processor has five little RISC-V cores in it. We call them the baby RISC,
right? And they do stuff like fetch data, execute math operations, push data around, manage
the NoC [network-on-chip], do some other things. And they’re RISC-V partly because we
could write it to do what we want, we don’t have to ask anybody for permission, we could
change. . . The baby RISC actually leaves some stuff out – it’s pretty simplified.
In the future generation we enhanced the math capabilities and fixed a whole bunch of
stuff so we could talk to the local hardware control pretty directly. So it’s ours.
We can do anything we want. We put RISC-V in our next-generation chip, which we’re
taping out soon, in part because we went and asked the other vendors to add some
floating-point formats for us [for] AI. So we’re keen on AI floating-point formats: accuracy
and precision stuff. And AI programs have to be – because you want to drive the small
floating-point data sizes but maintain the accuracy across, you know, billions of operations.
And the RISC-V guy said: ‘Sure, go call up SiFive. Say, “Can you do this?” I’m pretty sure. . .’
So we did. So that’s why RISC-V is in there now. And [in] the future – we think AI and
CPU integration is going to be interesting. And we want to be able to do what we want.
And I don’t want to have to ask somebody for permission to add a data type, or add a port
from the processor to this, or change how the data movement engine works. I just want to be
able to do it. . .
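The accuracy concern Keller raises here – driving small floating-point data sizes while maintaining accuracy across billions of operations – can be illustrated with a short sketch. This is an editorial illustration, not from the interview: `float16` stands in for a narrow AI format, and the values and counts are hypothetical and far smaller than a real workload.

```python
import numpy as np

# Hypothetical workload: 20,000 small values of 0.001 to be summed
# (true sum is about 20).
vals = np.full(20_000, 0.001, dtype=np.float16)

# Naive: keep the running sum in the narrow format itself.
narrow = np.float16(0.0)
for v in vals:
    narrow = np.float16(narrow + v)

# Mixed precision: narrow data, but a wider (float32) accumulator.
wide = np.float32(0.0)
for v in vals:
    wide += np.float32(v)

# The float16 accumulator stalls once adding 0.001 rounds to no change
# (around 4.0), while the float32 accumulator stays near the true sum.
print(float(narrow), float(wide))
```

This is why format choices matter at the hardware level: a narrow storage format only works if the accumulation path is wide enough, which is exactly the kind of change an open ISA lets a designer make without asking a vendor.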