Please create an account to participate in the Slashdot moderation system


Forgot your password?

Slashdot videos: Now with more Slashdot!

  • View

  • Discuss

  • Share

We've improved Slashdot's video section; now you can view our video interviews, product close-ups and site visits with all the usual Slashdot options to comment, share, etc. No more walled garden! It's a work in progress -- we hope you'll check it out (Learn more about the recent updates).


Comment: Wirth's law protects us from singularity (Score 3, Interesting) 181

by hkultala (#48520573) Attached to: Do you worry about the singularity?

There will never be enough processing power to create powerful enough AI that singularity will happen.

Wirth's law states that software gets slower due bloatness faster than Moore's law allows hardware to get faster.

We have moved from handcoded assemly and simple binary data format format to javascript which is either interpreted very slowly, or JIT-compiled into slightly faster code which is still 10 times slower than assembly, and XML or JSON-based data formats (which require a LOT of parsing). Now other languages are being complied to javascript, which adds another slowness layer on top of it.

So, it we invented a super-powerful AI that would be capable of creating truely smart code, it would spend it's time creating even more bloaty abstraction layers on top of each others, instead of creating anything that would be truely more intelligent.

Comment: Re:Microcode switching (Score 2) 161

by hkultala (#47776257) Attached to: Research Shows RISC vs. CISC Doesn't Matter

This same myth keeps being repeated by people who don't really understand the details on how processors internally work.

Actually, YOU are wrong.

You cannot just change the decoder, the instruction set affect the internals a lot:

All the reason you list could all be "fixed in software".

No, they cannot. OR the software will be terible slow , like 2-10 times slowdown.

The fact that silicon designed by Intel handles opcode in a way a little bit better optimized toward being fed from a x86-compatible frontend is just specific optimisation.

Opcodes are irrelevant. They are easy to translate. What matters are the differences in the semantics of the instructions.
X86 instructions update flags. This adds dependencies between instructions. Most RISC processoers do not have flags at all.
This is semantics of instructions, and they differ between ISA's.

Simply doing the same stuff with another RISCy back-end, i.e: interpreting the same ISA fed to the front-end, will simply require each x86 ISA being executed as a different set of micro-instructions. (some that are handled as single ALU opcode on Intel's silicon might require a few more instruction, but that's about the different).

The backend, the micro-instrucions in x86 CPUs are different than the instructions in RISC CPU's. They differ in the small details I tried to explain.

You could switch the frontend and speak a completely different instruction set. Simply if the two ISA are radically different, the result wouldn't be as efficient as a chip designed with that ISA in mind. (You would need a much bigger and less efficient microcode, because of all the reasons you list. They won't STOP intel from making a chip that speaks something else.

Intel did this, they added x86 decoder to their first itanium chips. And. They did not only add the frontend, they added some small pieces to their backend so that it could handle those strange x86 semantic cases nicely.
But the perfromance was still so terrible that nobody ever used it to run x86 code, and then they created a software translator that translated x86 code into itanium code, and that was faster, though still too slow.

Not only is this possible, but this was INDEED done.

There was an entire company called "Transmeta" whose business was centered around exactly that:
Their chip, the "Crusoe" was compatible with x86.
- But their chip was actually a VLIW chips, with the front-end being 100% pure software. Absolutely as remote from a pure x86 core as possible.'

The backend of Crusoe was designed completely x86 on mind, all the execution units contained the small quirks in a manner which made it easy to emulate x86 with it. The backend of Crusoe contains things like:

* 80-bit FPU,
* x86-compatible virtual memory page table format(one very important thing I forgot from my original list couple of posts ago; Memory accesses get VERY SLOW if you have to emulate virtual memory)
* support for partial register writes(to emulate 8- and 16-bit subregisters like al, ah,ax )

All these were made to make binary translation from x86 easy and reasonable fast.

Comment: Re:This is a myth that is not true (Score 4, Informative) 161

by hkultala (#47775093) Attached to: Research Shows RISC vs. CISC Doesn't Matter

Some of what you said is legitimate. Most of it is irrelevant, since it does not speak to the postulate. You're speaking of issues which will affect performance. So what? You'd have a less-performant processor in some cases, and it would be faster in others.


1) if the codition codes work totally differently, they don't work.

2) The data paths needed for separate and compined FP and integer regs are so different that it makes absolutely NO sense to have them together in chip that runs x86 ISA, even though it's possible.

3) If you don't have those x86-compatible address calculation units, you have to break most of memory ops into more micro-ops OR even run them with microcode. Both are slow. And if you have a RISC chip you want to have only the address calculation units you need for your simple base+offset addressing.

4) In the basic RISC pipeline there are two operands, one output/instruction. There are no data paths for two results, you cannot execute operations with multiple outputs such as x86 muliply which produces 2 values(low and high part of result), unless you do something VERY SLOW.

6) IF your RISC instruction set says you have aligned memory operations, you design your LSU to have only those, as it makes the LSU's much smaller, simpler and faster. But you need unaligned accesses for x86.

9) If your FPU calculates with different bit width, it calculates wrongly.


Comment: Re:isn't x86 RISC by now? (Score 1) 161

by hkultala (#47774469) Attached to: Research Shows RISC vs. CISC Doesn't Matter

After AMD lost the license to manufacture Intel i486 processors, together with other people, they were forced to design their own chip from the ground up. So they basically used one of the 29k RISC processors and put an x86 frontend on it.

This was their plan, but it ended up being quite much harder than they originally thought, and K5 came out much later, much different and much slower than planned. There are quite a lot of thigns that have to be done differently (some of them are explained in my another post)

Comment: This is a myth that is not true (Score 5, Informative) 161

by hkultala (#47774375) Attached to: Research Shows RISC vs. CISC Doesn't Matter

That is correct. Every time this comes up I like to spark a debate over what I perceive as the uselessness of referring to an "instruction set architecture" because that is a bullshit, meaningless term and has been ever since we started making CPUs whose external instructions are decomposed into RISC micro-ops. You could switch out the decoder, leave the internal core completely unchanged, and have a CPU which speaks a different instruction set. It is not an instruction set architecture. That's why the architectures themselves have names. For example, K5 and up can all run x86 code, but none of them actually have logic for each x86 instruction. All of them are internally RISCy. Are they x86-compatible? Obviously. Are they internally x86? No, nothing is any more.

This same myth keeps being repeated by people who don't really understand the details on how processors internally work.

You cannot just change the decoder, the instruction set affect the internals a lot:

1) Condition handling is totally different on different instruciton sets. This affect the banckend a lot. X86 has flags registers, many other architectures have predicate registers, some predicate registers with different conditions.

2) There are totally different number of general purpose and floating point registers. The register renamer makes this a smaller difference, but then there is the fact that most RISC's use same registers for both FPU and integer, X86 has separate registers for both. And this totally separates them, the internal buses between the register files and function units in the processor are done very differently.

3) Memory addressing modes are very different. X86 still does relatively complex address calculations on single micro-operation, so it has more complex address calculation units.

4) Whether there are operations with more than 2 inputs, or more than 1 output has quite big impact on what kind of internal buses are needed, how many register read and write ports are needed.

5) There are a LOT of more complex instructions in X86 ISA which are not split into micro-ops but handled via microcode. the microcode interpreter is totally missing on pure RISCs ( but exists on some not-so pure RISC's like Powe/PowerPC).

6) Instruction set dictates the memory aligment rules. Architectures with more strict alignment rules can have simples load-store-units.

7) Instruction set dictatetes the multicore memory ordering rules. This may affect the load-store units, caches and buses.

8) Some instructions have different bitnesses in different architectures. For example x86 has N x X -> 2N wide multiply operations which most RISC's don't have. So x86 needs bigger/different multiplier than most RISCs.

9) X87 FPU values are 80-bit wide(truncated to 64-bit when storing/loading). Practically all the other CPU's have maximum of 64-bit wide FPU values (though some versions Power have support for 128-bit FP numbers also)

Comment: The article is bad - mfg technology dominates (Score 2) 161

by hkultala (#47774131) Attached to: Research Shows RISC vs. CISC Doesn't Matter

They are seriously comparing some 90nm process with much better intel 32nm and 45 nm processes.

They have just taken some random cores made on random (and uncomparable) manufacturing technologies, throw couple of benchmarks and try to declare universal results based on these.

Few facts about the benchmarks setup and the cores cores:

1) They use ancient version of GCC. ARM suffers this much more than x86.
2) Bobcat is relatively balanced core, no bad bottlenecks. mfg tech is cheap, not high performance but relatively small/new.
3) Cortex A8 and A9 are really starved by bad cache design. Newer A7 and A12 would be similar in area and powet consumption but much better in performance and performance/power. There are also manufactured on old cheap mfg processes, which hurt them. Use modern manufacturing tech and results are quite much better
4) Their loonson is made on ANCIENT technology. With modern mfg tech it would be many times better on performance/power.
5) The cortex A15, even though made on 32nm process, is cheap process, not much better than intel's 45nm process and much worse than intel's 32nm. Also it's known to be a "power hog"-design. Qualcomm's Krait has similar performance level, but with much lower power.

Comment: Kaveri is much better as PC chip (Score 2) 105

- Single-thread performance matters much more than multi-thread performance, and Kaveri has almost twice the single-thread performance of the Xbone and PS4 chips.

- Memory bandwidth is expensive. You either need wide and expensive bus, or expensive low-capasity graphics DRAM which need soldering, and means you are limited to 4 GiB of memory(with the highest capasity GDDR chips out there), with zero possibility of late upgrading it, or both(and MAYBE get 8 giB of soldered memory). Though there has been rumours that Kaveri might support GDDR5, for configurations with only 4 GiB of soldered memory.

- And when you have that limited memory bandwidth, it does not make sense to waste die space on creating monster GPU which is starved by the lack of bandwidth.

- ALL the mentioned chips are of same generation. All support cache-coherent unified memory.

As PC chip, Kaveri makes much more sense:

- Software that matters on PC cannot use 8 threads. Kaveri is much faster at most software
- Weaker GPU side, ability to use cheap DDR3, and narrower memory bus makes Kaveri chip and kaveri-bases systems cheaper to manufacture
- The CPU can be socket, need to to be soldered, and the memory chips can use DIMMs instead of soldering to motherboard. Ability to upgrade something and system manufacturers to easily create different configurations.

Comment: Aluminium is the fuel, not water (Score 5, Informative) 85

by hkultala (#41145687) Attached to: Micromotors Race About By Turning Water Into Hydrogen Gas

argh, again this kind of misleading headline that makes the people who only read the headline think a perpetual machine is finally invented.

The energy comes from aluminium, aluminium "burning" into aluminium-oxide.

Putting the "converting water into hydrogen" into headline is misleading reporting.

Comment: The world needs a good and open mobile OS (Score 2) 63

by hkultala (#40673961) Attached to: MeeGo Startup Jolla Signs Phone Deal

WP has crappy multitasking, and all your data are belong to Microsoft.

With IOS all your data are belong to Apple. And everything is controller by Apple.

With Android all your data are belong to Google, and performance is bad.

With Symbian the user owns his data, but performance is bad, sw development is really pain, and UI is bad. RIP.

What is needed is operating system that allows the user to own his data, has good performance, and allows the user to use the device the way he wants.
Meego/Mer gives this.

Comment: Jolla does not mean a rescue boat - too small (Score 3, Informative) 200

by hkultala (#40579885) Attached to: Ex-Nokia Staff To Build MeeGo-based Smartphones

I'm a finn,so I know what "Jolla" means.

Jolla means a very small sailing boat - not meant for rescue, but meant for people who want to go sailing alone on a very small boat.
(who either cannot afford bigger boat or just likes very small boats)

Jollas cant be used as rescue boats, they are too small for that.

Comment: Triathlon bike (Score 1) 356

by hkultala (#40339503) Attached to: The bicycle I most often ride is ...

Tires and most components like road racing bikes, but has aerobars and bullhorn bars instead of drop bars.

And has more agressive geometry, 78 degree sea post instead of 73 degree.

So it's faster than road racing bike, but illegal to use on road races.

Simply the fastest way to travel on own power, recumbrants may be faster on straight road, but they are too clumsy to for city traffic.

Comment: Missing option - my local data IS my online data (Score 1) 191

by hkultala (#40223815) Attached to: I assume that my data stored online is ...

I have my own server machine. Most of my data is there.

Then I have my laptop. it has sshd, I can remote-login into that when I leave it on a network.

Then I have my phone(N900) It also has sshd, and www server. It's always in my pocket but all data in are also always online.

Time-sharing is the junk-mail part of the computer business. -- H.R.J. Grosch (attributed)