## Divide By Depth??

DavidS
Posts: 3800
Joined: Thu Dec 15, 2011 6:39 am
Location: USA
Contact: Website

### Divide By Depth??

I am working on a couple of projects that include simple software 3D rendering (in RISC OS), and keep getting frustrated at the need for a division routine to maintain backwards compatibility with ARMv6 (e.g. RPi B/B+). All of the math is integer, done in pure ARM (no ARM extensions used); rotations and similar are done by value scaling and using scaled sine/cosine tables.

Does anyone know a trick to perform the perspective calculations without division, or at least a way to keep it to powers of 2 (so that simple shifts will do the trick)?
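One common way to take the divide out of the per-vertex path is a reciprocal table: divide once per possible depth at startup, then project with a multiply and a shift. The sketch below is only an illustration under assumptions that are not from the thread: depth is a positive integer no larger than a hypothetical `MAX_DEPTH`, reciprocals are stored with 16 fractional bits, and a 64-bit intermediate is available (in pure ARM assembly that would need a long multiply or a narrower format).

```c
#include <assert.h>
#include <stdint.h>

#define RECIP_BITS 16    /* reciprocals stored as 0.16 fixed point (assumption) */
#define MAX_DEPTH  1024  /* hypothetical largest depth the scene can produce */

static uint32_t recip_table[MAX_DEPTH + 1];

/* Fill the table once at startup; this is the only place a divide is used. */
static void init_recip_table(void)
{
    recip_table[0] = 0;  /* depth 0 is never valid */
    for (uint32_t z = 1; z <= MAX_DEPTH; z++)
        recip_table[z] = (1u << RECIP_BITS) / z;
}

/* Approximates (coord * focal) / z using a multiply and a shift. */
static int32_t project(int32_t coord, uint32_t focal, uint32_t z)
{
    assert(z >= 1 && z <= MAX_DEPTH);
    /* Work on the magnitude so the right shift never sees a negative value. */
    uint64_t mag = (coord < 0) ? (uint64_t)(-(int64_t)coord) : (uint64_t)coord;
    uint64_t scaled = (mag * focal * recip_table[z]) >> RECIP_BITS;
    return (coord < 0) ? -(int32_t)scaled : (int32_t)scaled;
}
```

Because each reciprocal is truncated, the result can come out one unit below the exact quotient when the depth is not a power of 2; more fractional bits reduce that at the cost of a wider multiply.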

The dataset being used is that of imported models/scenes, so I am limited in how much I can customise the data to simplify life. Even importing the original data (made with CAD and other 3D editing software) requires translating floating point values into scaled integer values.
RPi = Way for me to have fun and save power.
100% Off Grid.
Household TTL Electricity Usage = 1.4KW/h per day.
500W Solar System, produces 2.8KW/h per day average.

DavidS
Posts: 3800
Joined: Thu Dec 15, 2011 6:39 am
Location: USA
Contact: Website

### Re: Divide By Depth??

It would appear that the biggest part of my slowdown was elsewhere, namely in the linear interpolation algorithm used to implement simple Gouraud shading.

So I am looking for any better Linear Interpolation algorithms that lend themselves well to integer only implementation, and may be faster than mine.

This is all the easy part, as I am doing a quick get it working fast enough even on low end hardware in ARM Assembly language right now.

The hard part will be after I get it working, then I will have to do a complete rewrite in C.

ejolson
Posts: 2874
Joined: Tue Mar 18, 2014 11:47 am

### Re: Divide By Depth??

DavidS wrote:
Sun Nov 11, 2018 10:47 pm
It would appear that the biggest part of my slowdown was elsewhere, namely in the linear interpolation algorithm used to implement simple Gouraud shading.

So I am looking for any better Linear Interpolation algorithms that lend themselves well to integer only implementation, and may be faster than mine.

This is all the easy part, as I am doing a quick get it working fast enough even on low end hardware in ARM Assembly language right now.

The hard part will be after I get it working, then I will have to do a complete rewrite in C.
Have you looked at the Doom source code to see how things are done there? It contains a full software 3D rendering system.

jamesh
Raspberry Pi Engineer & Forum Moderator
Posts: 22064
Joined: Sat Jul 30, 2011 7:41 pm

### Re: Divide By Depth??

DavidS wrote:
Sun Nov 11, 2018 10:47 pm
It would appear that the biggest part of my slowdown was elsewhere, namely in the linear interpolation algorithm used to implement simple Gouraud shading.

So I am looking for any better Linear Interpolation algorithms that lend themselves well to integer only implementation, and may be faster than mine.

This is all the easy part, as I am doing a quick get it working fast enough even on low end hardware in ARM Assembly language right now.

The hard part will be after I get it working, then I will have to do a complete rewrite in C.
Using fixed point in a standard linear interpolation algorithm might be the best option, although the ARM does have a floating point block, so the speed difference might not be huge. You can probably get away with a relatively small number of fractional bits, since it's only for a shading calculation.

The VideoCore PWL blocks use fixed point, of varying depths, which is accurate enough for most purposes, for example camera algorithms, which do a lot of linear interpolation.
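As a rough illustration of fixed-point interpolation for shading, the sketch below fills one span of intensities using a single divide per span and only adds and shifts per pixel. The function name `shade_span`, the 8 fractional bits, and the 8-bit intensity range are assumptions for the example, not details taken from anyone's renderer.

```c
#include <assert.h>
#include <stdint.h>

#define FRAC_BITS 8                   /* fractional bits (assumption) */
#define FRAC_ONE  (1 << FRAC_BITS)

/* Fill out[0..count-1] with intensities running from c0 to c1 inclusive. */
static void shade_span(uint8_t *out, int count, uint8_t c0, uint8_t c1)
{
    assert(count >= 1);
    /* Start half a unit up so the truncating shift below rounds to nearest. */
    int32_t value = ((int32_t)c0 << FRAC_BITS) + FRAC_ONE / 2;
    int32_t step = 0;
    if (count > 1)                    /* the only divide, once per span */
        step = (((int32_t)c1 - c0) * FRAC_ONE) / (count - 1);
    for (int i = 0; i < count; i++) {
        out[i] = (uint8_t)(value >> FRAC_BITS);  /* value never goes negative */
        value += step;
    }
}
```

With 8 fractional bits the accumulated error over a span stays well below one intensity level for spans of a few hundred pixels, which fits the point that shading does not need many fractional bits.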
Principal Software Engineer at Raspberry Pi (Trading) Ltd.
Contrary to popular belief, humorous signatures are allowed. Here's an example...
"My grief counseller just died, luckily, he was so good, I didn't care."

Heater
Posts: 12151
Joined: Tue Jul 17, 2012 3:02 pm

### Re: Divide By Depth??

DavidS,
I am doing a quick get it working fast enough even on low end hardware in ARM Assembly language right now....The hard part will be after I get it working, then I will have to do a complete rewrite in C.
Isn't this backwards?

Typically programmers start out developing their algorithms in a higher level language, say C or Pascal, where it is easier to verify one's logic and tweak the algorithm, or swap one algorithm for another.

When that is done at least there exists a working version of the code that is useful and cross-platform.

Then, if performance demands it they look into what needs optimizing and may perhaps rewrite critical parts in assembler.

Of course nowadays it's very hard for a hand assembly language programmer to outperform a C compiler. So why bother? It's a lot of work with little gain and only works on one architecture.

You may find using floating point is quicker if you have a lot of multiplies and divides going on.

How is your Extended Pascal implementation coming along?

DavidS
Posts: 3800
Joined: Thu Dec 15, 2011 6:39 am
Location: USA
Contact: Website

### Re: Divide By Depth??

Heater wrote:
Mon Nov 12, 2018 2:52 pm
DavidS,
I am doing a quick get it working fast enough even on low end hardware in ARM Assembly language right now....The hard part will be after I get it working, then I will have to do a complete rewrite in C.
Isn't this backwards?

Typically programmers start out developing their algorithms in a higher level language, say C or Pascal, where it is easier to verify one's logic and tweak the algorithm, or swap one algorithm for another.

When that is done at least there exists a working version of the code that is useful and cross-platform.

Then, if performance demands it they look into what needs optimizing and may perhaps rewrite critical parts in assembler.
I actually am using an HLL to test the algorithms before going to assembly. The HLL in use is BBC BASIC V also known as ARM BASIC.

Though I am doing assembly before C because it is easier for me to code in assembly than in C. Do not get me wrong, C is pretty easy, just not as easy as assembly, which is not as easy as BASIC V.
Of course nowadays it's very hard for a hand assembly language programmer to outperform a C compiler. So why bother? It's a lot of work with little gain and only works on one architecture.
Three reasons to bother:
1. ARM assembly is simpler than C.
2. The target OS is RISC OS, which is tied to the ARM CPU.
3. There are some gains, however small, especially when working in pure ARM (no extensions).
You may find using floating point is quicker if you have a lot of multiplies and divides going on.
Only two divides per vertex per rendering, and only one other divide in the rest of the program (unless you count right shifts), which is executed a single time at startup.

Not if using the pure ARM ISA. That is, no extensions, so no VFP/NEON, as they are optional extensions and not part of the core ISA. If I were allowing myself the use of the extensions, then yes, I would agree.
How is your Extended Pascal implementation coming along?
I have not done a lot with that in a while. Remember I have had a few medical issues since that was started. It kind of works, though I have done very little with it in quite a while.

I am actually having to relearn a lot of what I knew well after my stroke.

DavidS
Posts: 3800
Joined: Thu Dec 15, 2011 6:39 am
Location: USA
Contact: Website

### Re: Divide By Depth??

jamesh wrote:
Mon Nov 12, 2018 1:21 pm
DavidS wrote:
Sun Nov 11, 2018 10:47 pm
It would appear that the biggest part of my slowdown was elsewhere, namely in the linear interpolation algorithm used to implement simple Gouraud shading.

So I am looking for any better Linear Interpolation algorithms that lend themselves well to integer only implementation, and may be faster than mine.

This is all the easy part, as I am doing a quick get it working fast enough even on low end hardware in ARM Assembly language right now.

The hard part will be after I get it working, then I will have to do a complete rewrite in C.
Using fixed point in a standard linear interpolation algorithm might be the best option, although the ARM does have a floating point block, so the speed difference might not be huge. You can probably get away with a relatively small number of fractional bits, since it's only for a shading calculation.

The Videocore PWL blocks use fixed point, of varying depths, which is accurate enough for most, for example, camera algorithms, which do a lot of linear interpolation.
Yes indeed. Using fixed point where need be is pretty much what I am doing, with increments of 1/256 (as shl #8 and shr #8 handle the conversion from and to pure integer).
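For anyone following along in C, a minimal sketch of that 1/256 format might look like the following. The type name `fix8` and the helper names are made up for the example, the 64-bit intermediate in `fix_mul` is a convenience that pure ARM code would replace with a long multiply or pre-shifted operands, and `fix_to_int` assumes the compiler performs an arithmetic right shift on negative values (GCC and Clang do).

```c
#include <assert.h>
#include <stdint.h>

typedef int32_t fix8;                 /* 24.8 fixed point: value * 256 */

#define FIX_SHIFT 8
#define FIX_ONE   (1 << FIX_SHIFT)

/* Integer to fixed point (the "shl #8" direction). */
static fix8 fix_from_int(int32_t n) { return n * FIX_ONE; }

/* Fixed point to integer (the "shr #8" direction); discards the fraction. */
static int32_t fix_to_int(fix8 f) { return f >> FIX_SHIFT; }

/* Import-time conversion from a floating point model coordinate, rounded. */
static fix8 fix_from_float(double x)
{
    assert(x > -8388607.0 && x < 8388607.0);   /* must fit in 24.8 */
    return (fix8)(x * FIX_ONE + (x >= 0 ? 0.5 : -0.5));
}

/* Multiply two fixed point values; the product carries 16 fractional bits,
 * so shift 8 of them back out. */
static fix8 fix_mul(fix8 a, fix8 b)
{
    return (fix8)(((int64_t)a * b) >> FIX_SHIFT);
}
```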

The pure ARM core has FP? I thought that FP was only in the optional extensions like VFP/NEON, is there something I do not know?

jamesh
Raspberry Pi Engineer & Forum Moderator
Posts: 22064
Joined: Sat Jul 30, 2011 7:41 pm

### Re: Divide By Depth??

DavidS wrote:
Mon Nov 12, 2018 3:12 pm
jamesh wrote:
Mon Nov 12, 2018 1:21 pm
DavidS wrote:
Sun Nov 11, 2018 10:47 pm
It would appear that the biggest part of my slowdown was elsewhere, namely in the linear interpolation algorithm used to implement simple Gouraud shading.

So I am looking for any better Linear Interpolation algorithms that lend themselves well to integer only implementation, and may be faster than mine.

This is all the easy part, as I am doing a quick get it working fast enough even on low end hardware in ARM Assembly language right now.

The hard part will be after I get it working, then I will have to do a complete rewrite in C.
Using fixed point in a standard linear interpolation algorithm might be the best option, although the ARM does have a floating point block, so the speed difference might not be huge. You can probably get away with a relatively small number of fractional bits, since it's only for a shading calculation.

The Videocore PWL blocks use fixed point, of varying depths, which is accurate enough for most, for example, camera algorithms, which do a lot of linear interpolation.
Yes indeed. Using fixed point where need be is pretty much what I am doing, with increments of 1/256 (as shl #8 and shr #8 handle the conversion from and to pure integer).

The pure ARM core has FP? I thought that FP was only in the optional extensions like VFP/NEON, is there something I do not know?
All the Arm cores on all Raspberry Pi models have floating point units, so you can literally just execute floating point instructions from the Arm instruction set.

jahboater
Posts: 4186
Joined: Wed Feb 04, 2015 6:38 pm

### Re: Divide By Depth??

DavidS wrote:
Mon Nov 12, 2018 3:12 pm
The pure ARM core has FP? I thought that FP was only in the optional extensions like VFP/NEON, is there something I do not know?
I think what you mean is that for the older ARM CPU's, the floating point unit was an optional co-processor.

All Pi models from day one had this co-processor, so they all have hardware floating point (single and double precision) as standard.

Now! For interest ...
The Pi2, Pi3 and Pi3+ have ARMv8 CPU's - and for ARMv8, floating point/SIMD is no longer a co-processor, it is built in to the ARM core.
So VFP and NEON are not even optional on the bigger Pis; they are guaranteed to be present.

On any Pi model you can use floating point freely and safely.

DavidS
Posts: 3800
Joined: Thu Dec 15, 2011 6:39 am
Location: USA
Contact: Website

### Re: Divide By Depth??

jahboater wrote:
Mon Nov 12, 2018 3:31 pm
DavidS wrote:
Mon Nov 12, 2018 3:12 pm
The pure ARM core has FP? I thought that FP was only in the optional extensions like VFP/NEON, is there something I do not know?
I think what you mean is that for the older ARM CPU's, the floating point unit was an optional co-processor.

All Pi models from day one had this co-processor, so they all have hardware floating point (single and double precision).
Yes, all Raspberry Pi models have the optional extension. It is still optional, though, and if the code ever needs to be ported to another RISC OS system, using these extensions could be a problem.

Now some of what I am doing is definitely specific to RPi HW, in that I am using GPIO's for some of this, and doing so directly through the HW registers (not using the GPIO module of RISC OS), though that should be able to be ported as well.

At this time the Raspberry Pi 3B is the quickest RISC OS machine I can afford, and the Raspberry Pi B+ is the most versatile RISC OS machine that seems to be on the market.
Now! For interest ...
The Pi2, Pi3 and Pi3+ have ARMv8 CPU's - and for ARMv8, floating point/SIMD is no longer a co-processor, it is built in to the ARM core.
So VFP and NEON is not even an option for the bigger Pi's, it is guaranteed to be present.
That is for the RPi, not so much for other systems. While I am targeting the RPi, this should not give me permission to make it more difficult for someone to port it to another ARM system with RISC OS in the future. And while they may not be optional on ARMv8, they are still extensions.
On any Pi model you can use floating point freely and safely.
On any Pi model.
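One way to keep the integer path as the portable baseline while still letting a build use hardware floating point where it is known to exist is a compile-time switch. The sketch below is only an illustration: it assumes a compiler that defines the ACLE macro `__ARM_FP` when targeting hardware floating point (GCC and Clang do; whether a given RISC OS toolchain does is not something the thread establishes), and `scale_coord` and `FORCE_INTEGER_ONLY` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Use hardware floating point only when the compiler says the target has it
 * and the build has not asked for the portable integer path. */
#if defined(__ARM_FP) && !defined(FORCE_INTEGER_ONLY)
#define HAVE_HW_FLOAT 1
#else
#define HAVE_HW_FLOAT 0
#endif

/* Scale a coordinate by the ratio num/den. */
static int32_t scale_coord(int32_t coord, int32_t num, int32_t den)
{
    assert(den != 0);
#if HAVE_HW_FLOAT
    return (int32_t)((float)coord * (float)num / (float)den);
#else
    /* Portable path: integer only, using a 64-bit intermediate product. */
    return (int32_t)(((int64_t)coord * num) / den);
#endif
}
```

Building with `FORCE_INTEGER_ONLY` defined gives the same integer-only code on every target, which is the behaviour a port to a machine without the extensions would need.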

jahboater
Posts: 4186
Joined: Wed Feb 04, 2015 6:38 pm

### Re: Divide By Depth??

DavidS wrote:
Mon Nov 12, 2018 3:53 pm
The Pi2, Pi3 and Pi3+ have ARMv8 CPU's - and for ARMv8, floating point/SIMD is no longer a co-processor, it is built in to the ARM core.
So VFP and NEON is not even an option for the bigger Pi's, it is guaranteed to be present.
That is for the RPi, not so much for other systems.
No, it is true for all modern ARM CPU's. True for all systems. It is part of the ARMv8 architecture.
NEON is guaranteed present no matter who manufactures the SoC!

See the ARMv8 ARM (if you can find your way around it!)
I see ARMv8 does half precision (16-bit) floating point too, I didn't know that.
Last edited by jahboater on Mon Nov 12, 2018 4:15 pm, edited 1 time in total.

jamesh
Raspberry Pi Engineer & Forum Moderator
Posts: 22064
Joined: Sat Jul 30, 2011 7:41 pm

### Re: Divide By Depth??

Are there any current SoC scale ARM devices without FPU's? Not talking M0 type of stuff, but cores that go on SBC's.

DavidS
Posts: 3800
Joined: Thu Dec 15, 2011 6:39 am
Location: USA
Contact: Website

### Re: Divide By Depth??

jamesh wrote:
Mon Nov 12, 2018 4:04 pm
Are there any current SoC scale ARM devices without FPU's? Not talking M0 type of stuff, but cores that go on SBC's.
I am not sure if there are still any ARMv5 devices around, or any of the older ARMv6/7. I know it used to be more common to see ARM without the extensions, even in the early v7 era, though there may not be any now.

DavidS
Posts: 3800
Joined: Thu Dec 15, 2011 6:39 am
Location: USA
Contact: Website

### Re: Divide By Depth??

jahboater wrote:
Mon Nov 12, 2018 4:02 pm
DavidS wrote:
Mon Nov 12, 2018 3:53 pm
The Pi2, Pi3 and Pi3+ have ARMv8 CPU's - and for ARMv8, floating point/SIMD is no longer a co-processor, it is built in to the ARM core.
So VFP and NEON is not even an option for the bigger Pi's, it is guaranteed to be present.
That is for the RPi, not so much for other systems.
No, it is true for all modern ARM CPU's. True for all systems. It is part of the ARMv8 architecture.
NEON is guaranteed present no matter who manufactures the SoC!

See the ARMv8 ARM (if you can find your way around it!)
I see ARMv8 does half precision (16-bit) floating point too, I didn't know that.
One more time, no-one forces the use of ARMv8. There are still new ARMv7 SoC's being made, including some of the fastest RISC OS machines around today.

I am sure that it will be a good while before we begin to see the ARMv6 and ARMv7 devices drop away, there is a reason that there are hard-float and soft-float Linux distro's for the ARM.

DavidS
Posts: 3800
Joined: Thu Dec 15, 2011 6:39 am
Location: USA
Contact: Website

### Re: Divide By Depth??

Back on topic:
I am making some good progress in figuring things out. Using the same basic technique as implemented in Bresenham's Run-Sliced line drawing algorithm the linear interpolation has been greatly sped up, for every use case (not just shading).

I am still thinking about having an option for flat shading if the user wants a lot more speed and does not mind the loss of visual quality that goes with it.

As soon as I get access to my web site for updating again, I will go ahead and upload the WIP versions. It used to be that I would not release anything until I was extremely happy with the way it works, though now I realize the value in sharing the early stages of a project.

jahboater
Posts: 4186
Joined: Wed Feb 04, 2015 6:38 pm

### Re: Divide By Depth??

jamesh wrote:
Mon Nov 12, 2018 4:04 pm
Are there any current SoC scale ARM devices without FPU's? Not talking M0 type of stuff, but cores that go on SBC's.
I don't know of any, but I haven't done an exhaustive search!

jamesh
Raspberry Pi Engineer & Forum Moderator
Posts: 22064
Joined: Sat Jul 30, 2011 7:41 pm

### Re: Divide By Depth??

DavidS wrote:
Mon Nov 12, 2018 4:20 pm
jahboater wrote:
Mon Nov 12, 2018 4:02 pm
DavidS wrote:
Mon Nov 12, 2018 3:53 pm

That is for the RPi, not so much for other systems.
No, it is true for all modern ARM CPU's. True for all systems. It is part of the ARMv8 architecture.
NEON is guaranteed present no matter who manufactures the SoC!

See the ARMv8 ARM (if you can find your way around it!)
I see ARMv8 does half precision (16-bit) floating point too, I didn't know that.
One more time, no-one forces the use of ARMv8. There are still new ARMv7 SoC's being made, including some of the fastest RISC OS machines around today.

I am sure that it will be a good while before we begin to see the ARMv6 and ARMv7 devices drop away, there is a reason that there are hard-float and soft-float Linux distro's for the ARM.
But do any of the currently made ARMv6 and ARMv7 devices omit the FPU? Seems unlikely. The Pi's ARMv6 chip certainly has one.

I was going to suggest a modification of Bresenham's line drawing algorithm, but I suspected that might be less efficient.

DavidS
Posts: 3800
Joined: Thu Dec 15, 2011 6:39 am
Location: USA
Contact: Website

### Re: Divide By Depth??

jamesh wrote:
Mon Nov 12, 2018 4:37 pm
DavidS wrote:
Mon Nov 12, 2018 4:20 pm
jahboater wrote:
Mon Nov 12, 2018 4:02 pm
No, it is true for all modern ARM CPU's. True for all systems. It is part of the ARMv8 architecture.
NEON is guaranteed present no matter who manufactures the SoC!

See the ARMv8 ARM (if you can find your way around it!)
I see ARMv8 does half precision (16-bit) floating point too, I didn't know that.
One more time, no-one forces the use of ARMv8. There are still new ARMv7 SoC's being made, including some of the fastest RISC OS machines around today.

I am sure that it will be a good while before we begin to see the ARMv6 and ARMv7 devices drop away, there is a reason that there are hard-float and soft-float Linux distro's for the ARM.
But do any of the currently made ARMv6 and ARMv7 devices omit the FPU? Seems unlikely. The Pi's ARMv6 chip certainly has one.

I was going to suggest a modification of Bresenham's line drawing algorithm, but I suspected that might be less efficient.
To quote what I said when you asked above:
DavidS wrote: I am not sure if there are still any ARMv5 devices around, or any of the older ARMv6/7. I know it used to be more common to see ARM without the extensions, even in the early v7 era, though there may not be any now.
So I am not sure, in short.

Paeryn
Posts: 2517
Joined: Wed Nov 23, 2011 1:10 am
Location: Sheffield, England

### Re: Divide By Depth??

jahboater wrote:
Mon Nov 12, 2018 4:02 pm
I see ARMv8 does half precision (16-bit) floating point too, I didn't know that.
It only supports half precision as a data type when converting, it can't do any arithmetic on half precision values.
She who travels light — forgot something.

Heater
Posts: 12151
Joined: Tue Jul 17, 2012 3:02 pm

### Re: Divide By Depth??

So given that there are no users of the restricted ARM instruction set that you are targeting why are you making life difficult for yourself by targeting it?

ejolson
Posts: 2874
Joined: Tue Mar 18, 2014 11:47 am

### Re: Divide By Depth??

Paeryn wrote:
Mon Nov 12, 2018 5:38 pm
jahboater wrote:
Mon Nov 12, 2018 4:02 pm
I see ARMv8 does half precision (16-bit) floating point too, I didn't know that.
It only supports half precision as a data type when converting, it can't do any arithmetic on half precision values.
I think if you are running in 64-bit ARMv8 mode the NEON FPU supports vector operations on eight half-precision floats at a time. It is possible that access to half-precision floating point is a more compelling reason for a 64-bit operating system than any of the other points mentioned so far.

I think 64-bit mode might be easier than SMP multiprocessing. Are there any development efforts towards making RISC OS run in 64-bit mode?

DavidS
Posts: 3800
Joined: Thu Dec 15, 2011 6:39 am
Location: USA
Contact: Website

### Re: Divide By Depth??

Heater wrote:
Mon Nov 12, 2018 5:39 pm
So given that there are no users of the restricted ARM instruction set that you are targeting why are you making life difficult for yourself by targeting it?
Every ARM user can run the non-restrictive instruction set I am using. From the users' point of view, the restrictive choice is to require the extensions.

Regardless of what is currently on the market, there are still people running IYONIX and RISC PC systems, neither of which has these extensions. There are also newer single-board computers that are now out of production but still in use, and that still do not have the extensions.

Remember that RISC OS has a small enough target audience as it is, so why would I want to make it even smaller by alienating a good 30% of the users of the OS (those that stick with their old IYONIX, and those that refuse to upgrade from the RISC PC because of the lack of the 26-bit R15 modes in the newer ARM CPUs)?

And in the future we are likely to see more and more 32-bit systems that do not have these extensions, thanks to OpenCores projects. There is already the Amber core, which is 100% ARMv2 compatible, and there are others in varying degrees of completion. There are also people playing with newer ARM cores, up to the ARMv5 architecture, though they are still unable to do much with them due to ARM still holding IP rights that prevent anything beyond personal use.

Though an ARMv2a ISA CPU with greater than 1 instruction per second (Amber) is still quite usable in modern times, so long as you add an MMU. There is nothing stopping it from addressing a 32-bit address space for data; it only needs code to be within the first 64MB, due to R15 sharing its bits with the processor status register, and the MMU takes care of that. And there is nothing stopping someone from doing an ASIC implementation of Amber that could run at over a GHz, if someone chose to do so for a commercial product (as that would be the only cost-effective way).

Pair Amber with some very simple peripherals and one of the open source GPUs, implement the ASIC, slap a good PMMU in the system, and put it on a board with 3GB of DDR RAM, and that would be the perfect next-gen Raspberry Pi, especially if the ASIC had 8 of these Amber cores (and the associated PMMUs). Though that is just dreaming.

DavidS
Posts: 3800
Joined: Thu Dec 15, 2011 6:39 am
Location: USA
Contact: Website

### Re: Divide By Depth??

I am making some progress, though: I have improved my sine tables so they now resolve down to a single degree (within tolerance for the scale I am using for fixed point).
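To make the idea of a one-degree fixed-point sine table concrete, here is a small sketch that builds such a table with integer math only and uses it to rotate a point. It is not the table from my projects: the 1/256 scale matches the thread, but the table is generated with Bhaskara I's rational approximation, sin(x°) ≈ 4x(180 − x) / (40500 − x(180 − x)), which should land within roughly one 1/256 step of the true value. All names are hypothetical, and the right shifts assume arithmetic shifting of negative values.

```c
#include <assert.h>
#include <stdint.h>

#define FIX_SHIFT 8                    /* 1/256 steps */
#define FIX_ONE   (1 << FIX_SHIFT)

static int16_t sin_table[360];         /* sin(degrees) * 256 */

/* Build the table once at startup using Bhaskara I's approximation,
 * which is valid for 0..180 degrees; the second half is the mirror image. */
static void init_sin_table(void)
{
    for (int32_t deg = 0; deg < 180; deg++) {
        int32_t p = deg * (180 - deg);
        int32_t den = 40500 - p;
        int32_t s = (4 * p * FIX_ONE + den / 2) / den;   /* rounded */
        sin_table[deg] = (int16_t)s;
        sin_table[deg + 180] = (int16_t)-s;
    }
}

static int32_t fix_sin(int deg)
{
    assert(deg >= 0 && deg < 360);
    return sin_table[deg];
}

/* cos(x) = sin(x + 90), wrapped with a compare and subtract. */
static int32_t fix_cos(int deg)
{
    deg += 90;
    if (deg >= 360)
        deg -= 360;
    return fix_sin(deg);
}

/* Rotate (x, y) about the origin by deg whole degrees. */
static void rotate2d(int32_t *x, int32_t *y, int deg)
{
    int32_t s = fix_sin(deg), c = fix_cos(deg);
    int32_t nx = (*x * c - *y * s) >> FIX_SHIFT;
    int32_t ny = (*x * s + *y * c) >> FIX_SHIFT;
    *x = nx;
    *y = ny;
}
```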

I have also begun playing around with a slightly modified form of Bresenham's Run-Sliced Line Drawing algorithm (which is way faster than the normal Bresenham's Line Drawing Algorithm) for the shading interpolation; it seems to be quite fast and fits the application very well.
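For readers who have not met the idea, the sketch below shows the general Bresenham-style approach to interpolation: take the whole part of the per-pixel step once, then carry the remainder in an integer error term instead of using fractional bits. It is a plain C illustration of the principle rather than the run-sliced variant described above, and `interp_span` is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/* Fill out[0..count-1] with values running from v0 to v1 inclusive,
 * using one divide per span and only adds and compares per element. */
static void interp_span(int32_t *out, int count, int32_t v0, int32_t v1)
{
    assert(count >= 1);
    int32_t steps = count - 1;
    int32_t dv = v1 - v0;
    int32_t dir = (dv < 0) ? -1 : 1;
    int32_t adv = (dv < 0) ? -dv : dv;
    int32_t whole = 0, rem = 0;

    if (steps > 0) {
        whole = adv / steps;      /* whole units to add on every step */
        rem = adv % steps;        /* leftover, spread out by the error term */
    }
    if (dv < 0)
        whole = -whole;

    int32_t err = steps >> 1;     /* start half way so the result rounds */
    int32_t v = v0;
    for (int i = 0; i < count; i++) {
        out[i] = v;
        v += whole;
        err += rem;
        if (steps > 0 && err >= steps) {   /* remainder overflowed one unit */
            v += dir;
            err -= steps;
        }
    }
}
```

Interpolating from 0 to 10 over four elements gives 0, 3, 7, 10, and the last element always lands exactly on `v1`, which fixed-point stepping with a truncated step does not guarantee.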

Now I need to tune my rotation algorithms a little bit, then it is on to improving the path tracing algorithms. Thankfully, splitting the model into layers is very simple and fast, as is comparing adjacent layers for rate of change.

I think I may have given away what one of the projects is; two of the others are very strongly related, and will share a lot of the 3D rendering code.

Heater
Posts: 12151
Joined: Tue Jul 17, 2012 3:02 pm

### Re: Divide By Depth??

DavidS,

I don't understand. Could you explain what you mean by "restrictive" or "non-restrictive" instruction set. What restrictions are we talking about?

When it comes to Intellectual Property rights all ARM instruction sets are restricted. You need a license from ARM to implement them.

From a practical point of view I would say that limiting oneself to the instruction set of some early ARM designs is very restrictive, and unnecessarily so, as there are almost no users of such historical artifacts.
And there is nothing stopping someone from doing an ASIC implementation of Amber that could run at over a GHz, if someone chose to do so for a commercial product (as that would be the only cost effective way).
Is that really true?

If I understand correctly, there are no free-to-use ARM instruction sets, and anyone who starts implementing them commercially will be hearing from ARM's lawyers.

The only free for use ARM cores I have heard of are the recently announced Cortex-M1 for use in FPGA. https://bit-tech.net/news/tech/cpus/arm ... x-cores/1/
Pair amber with some very simple peripherals, and one of the Open source GPU's, implement the ASIC, slap a good PMMU in the system, and put it on a board with 3GB of DDR RAM and that would be the perfect next gen Raspberry Pi, especially if the ASIC had 8 of these Amber Cores (and the associated PMMU's). Though that is just dreaming.
Interesting dream.

Not going to happen due to the IP restrictions.

However I believe people are working on making such dreams come true using the Open Source RISC V instruction set specification. An early example of that kind of dream is this dual core, 64 bit, RISC V module with accelerators for DSP, FFT and neural nets built in: https://item.taobao.com/item.htm?id=578484113485

I like to think our future is not based on ARM, ARM Holdings or Soft Bank.

Gavinmc42
Posts: 2884
Joined: Wed Aug 28, 2013 3:31 am

### Re: Divide By Depth??

Pair Amber with some very simple peripherals and one of the open source GPUs, implement the ASIC, slap a good PMMU in the system, and put it on a board with 3GB of DDR RAM, and that would be the perfect next-gen Raspberry Pi, especially if the ASIC had 8 of these Amber cores (and the associated PMMUs). Though that is just dreaming.
Open source GPUs? I must have missed them; which ones are you referring to?
I have been thinking about learning RISC-V but I would like graphics.
I'm dancing on Rainbows.
Raspberries are not Apples or Oranges