
Comment: Re:no price? (Score 1) 64

by TheRaven64 (#47794381) Attached to: MIPS Tempts Hackers With Raspbery Pi-like Dev Board

Just curious, what's so wrong with branch with delay slot and isn't that more native way to look at branch ?

They're a pain for people on both sides of the ISA.

The compiler has to find an instruction that can run after the branch. This is normally trivial for calls, but for conditional branches within a function it's often difficult to find an instruction that you can put there. It has to be one that comes from before the jump (or that appears in both basic blocks after the jump), and that the branch doesn't depend on (because it's executed after the branch instruction). This means that you quite often end up padding the delay slots with nops, which bloats your instruction cache usage. On a superscalar implementation this is the only cost, but on a simple in-order pipeline it's also a completely wasted cycle.
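
To make the constraint concrete, here is a minimal sketch of the simplest possible delay-slot filler. It is not how any real compiler does it: the `Insn` type, the register sets and the function name are all made up for illustration, and it only ever considers the single instruction immediately before the branch, so that moving it cannot reorder it against anything else.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insn:
    """Hypothetical instruction model: text plus registers written and read."""
    text: str
    writes: frozenset = frozenset()
    reads: frozenset = frozenset()


NOP = Insn("nop")


def fill_delay_slot(block, branch):
    """Return block + branch + delay slot as a list of instructions.

    Only the last instruction of the block is a candidate. It may move into
    the delay slot if it has no register dependency on the branch in either
    direction; otherwise the slot is padded with a nop.
    """
    if block:
        candidate = block[-1]
        conflict = (
            (candidate.writes & branch.reads)      # branch needs its result
            | (candidate.reads & branch.writes)    # e.g. a link register
            | (candidate.writes & branch.writes)
        )
        if not conflict:
            return block[:-1] + [branch, candidate]
    return block + [branch, NOP]


addu = Insn("addu $t0, $t1, $t2", frozenset({"$t0"}), frozenset({"$t1", "$t2"}))
slt = Insn("slt $t3, $a0, $a1", frozenset({"$t3"}), frozenset({"$a0", "$a1"}))
bne = Insn("bne $t3, $zero, loop", frozenset(), frozenset({"$t3", "$zero"}))

# The branch reads $t3, which slt writes, so slt cannot move: pad with a nop.
padded = [i.text for i in fill_delay_slot([addu, slt], bne)]

# addu is independent of the branch, so it can fill the slot.
filled = [i.text for i in fill_delay_slot([slt, addu], bne)]
```

In the first case the slot is wasted on a nop, which is exactly the cache bloat described above.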

On the other side, it's a pain to implement. It made sense for a three-stage pipeline in the original MIPS, because you always knew the next instruction to fetch. A modern simple pipeline is 5-7 stages though, so your branch is still in register fetch (if there) by the time the delay slot is needed. It doesn't buy anything and it means that, if you're doing any kind of speculative execution (even simple branch prediction, which you really need to do to get moderately good performance) then you have an extra dependency to track - you can't just use the branch as the marker and flush everything after it, you need to do some reordering. In a superscalar implementation, you need to do even more complex things in register renaming to make it work.

Comment: Re:My opinion on the matter. (Score 1) 810

by Guy Harris (#47794125) Attached to: Choose Your Side On the Linux Divide

The whole "under 1024 is safe" is generally regarded for connecting *to* ports under 1024, not receiving connections from them.

It's only "safe" if 1) the machine to which you're connecting restricts the ability to bind to ports under 1024 (not guaranteed), 2) the only people running processes on the machine in question are either trustworthy or are restricted from running programs that bind to those ports (not guaranteed), and 3) the system services you care about have ports under 1024 (not guaranteed).

And what guarantees that a "system service" (whatever that might mean) has a port under 1024? Perhaps a better scheme for determining whether to trust a service is called for here - one that would probably obviate the need for "privileged ports".

Yes, some services (NFS in particular) want to trust incoming connections from ports under 1024 but they're in the minority.

One would hope that services that trust untrustworthy guarantees would be in the minority; in the best of all possible worlds, they would be completely non-existent.

If I was so inclined as to trust port numbers alone (and for the record, I don't trust incoming port numbers at all)

Good. Ideally, nobody would trust them, and claims such as "It prevents regular user programs from masquerading as system services, which usually sit below port 1024." would be treated as the uninteresting claims that they are.

Comment: Re:There's a lot more going on... (Score 1) 156

by Guy Harris (#47794031) Attached to: Research Shows RISC vs. CISC Doesn't Matter

The discussion was about adding more registers in a CISC architecture, and so CISC functionality is the context.

"CISC functionality" is the ability to execute a given CISC instruction set with acceptable performance. Transistors can be used in several different ways to achieve that, and you can choose to use fewer transistors in one place on a chip in favor of more transistors in another place, and if that choice means you still get better overall performance executing the same instruction set, that choice is a good one.

When you ask what "the same functionality means" that is absurd. You can't implement a subset of the functionality and still have the same functionality.

Again, as long as the full instruction set can be executed (even if some of it is executed by trap code), you don't have a subset. You may happen to execute some functions slower, and other functions faster, but if the net result is faster execution of the code actually run on the machine, you have a better implementation.

I'll put this in simpler terms. Smart people design CPUs and they don't add a bunch of registers even though that would be useful.

Smart people add registers iff they're sufficiently useful that it's worth either increasing the die size or taking transistors away from other functions.

The reason they don't do it is because of the additional chip real estate it would cost in an already over-taxed landscape, not because they are lazy or haven't thought of the idea.

For existing architectures, the reason they don't do it is that it would require changes to the instruction format, which, for most instruction set architectures, would be a royal pain. For x86, they (AMD, to be specific) could and did add Yet Another Prefix to double the number of registers, as the instruction set already had a tradition of adding prefixes. For ARM, they were already introducing a 64-bit variant of the instruction set, and didn't have to maintain binary compatibility. For System/3x0, for example, you'd have to add prefixes to an instruction set lacking prefixes, or somehow use opcode bits to refer to additional registers. If somebody were to design a brand new CISC architecture (in an era where we're not designing many new instruction set architectures at all), they could design one with 32 GPRs.
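
As an illustration of how a prefix can double the register count without touching the existing instruction format, here is a rough sketch of the x86-64 REX prefix applied to a register-direct ModRM byte. The function name is made up, and it deliberately ignores memory operands, SIB bytes and REX.X; it only shows REX.R and REX.B each supplying a fourth bit for a 3-bit register field.

```python
def decode_rex_modrm(rex, modrm):
    """Return (reg, rm) register numbers for a register-direct ModRM byte.

    rex is the REX prefix byte (0x40-0x4F), or None if there is no prefix.
    Hypothetical helper for illustration, not a full decoder.
    """
    if modrm >> 6 != 0b11:
        raise ValueError("only register-direct ModRM (mod == 3) is handled")
    reg = (modrm >> 3) & 0b111   # 3-bit field: registers 0-7 on its own
    rm = modrm & 0b111
    if rex is not None:
        if rex & 0xF0 != 0x40:
            raise ValueError("not a REX prefix")
        reg |= ((rex >> 2) & 1) << 3   # REX.R is the 4th bit of ModRM.reg
        rm |= (rex & 1) << 3           # REX.B is the 4th bit of ModRM.rm
    return reg, rm


# 48 89 C8 is "mov rax, rcx": reg = 1 (rcx), rm = 0 (rax).
low = decode_rex_modrm(0x48, 0xC8)

# 4D 89 C8 has the same ModRM byte, but REX.R and REX.B are set,
# so it is "mov r8, r9": reg = 9 (r9), rm = 8 (r8).
high = decode_rex_modrm(0x4D, 0xC8)
```

The ModRM byte is identical in both encodings; only the prefix changes which half of the 16 registers is meant.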

Comment: Re:Patent on this new feature (Score 2) 64

by TheRaven64 (#47791453) Attached to: MIPS Tempts Hackers With Raspbery Pi-like Dev Board
No idea. I don't know if the instructions for computing PC-relative addresses in an ISA without an architectural PC are patentable. They also exist in RISC-V (not sure which came first), so if they are then it's going to be a problem for Krste et al. Nothing else in there is especially novel: like ARMv8, it's a nicely designed compilation target, but it doesn't do anything that's especially exciting.

I didn't look at the floating point stuff in much detail, so there may be something there, although the biggest changes in recent versions of the MIPS specs have been that they're more closely aligned with the IEEE floating point standards, so it's hard to imagine anything there.

The biggest difference between MIPS64r6 and ARMv8 is that the MIPS spec explicitly reserves some of the opcode space for vendor-specific extensions (we use this space, although our core predates the current spec - it's largely codifying existing opcode use). This allows, for example, Cavium to add custom instructions that are useful for network switches but not very useful for other things. ARMv8, in contrast, expects that any non-standard extensions are in the form of accelerator cores with a completely different ISA. This means that any code compiled for one ARMv8 core should run on any ARMv8 implementation, which is a big advantage. With MIPS, anything compiled for the core ISA should run everywhere, but people using custom variants (e.g. Cisco and Juniper, who use the Cavium parts in some of their products) will ship code that won't run on another vendor's chips.

Historically, this has been a problem for the MIPS ecosystem because each MIPS vendor has forked GCC and GNU binutils, hacked it up to support their extensions, but done so in a way that makes it impossible to merge the code upstream (because they've broken every other MIPS chip in the process) and left their customers with an ageing toolchain to deal with. I've been working with the Imagination guys to try to make sure that the code in LLVM is arranged in such a way that it's relatively easy to add vendor-specific extensions without breaking everything else.

Imagination doesn't currently have any 64-bit cores to license, but I expect that they will quite soon...

Comment: Re:no price? (Score 3, Informative) 64

by TheRaven64 (#47790969) Attached to: MIPS Tempts Hackers With Raspbery Pi-like Dev Board

Wouldn't it be just a matter of re-compiling your code though?

Assuming that your code doesn't do anything that is vaguely MIPS specific. If it is, then there is little benefit in using MIPS32r2 now - ARMv7 is likely to be closer than MIPS32r2 to MIPS32r6 in terms of C (or higher-level language) source code compatibility.

I love MIPS and, that is the case in large part, because of its current instruction set. It seems like a bad idea to mess with the current instruction set and break backward compatibility. Why did they decide to do that?

Basically, because the MIPS ISA sucks as a compiler target. Delay slots are annoying and provide little benefit with modern microarchitectures. The only way to do PC-relative addressing is an ugly hack in the ABI, requiring that every call uses jalr with $t9 in the call, which means that you can't use bal for short calls. The lwl / lwr instructions for unaligned loads are just horrible and introduce nasty pipeline dependencies. The branch likely instructions are almost always misused, but as they're the only way of doing a branch without a delay slot there's often no alternative.

Reformatting a Machine 125 Million Miles Away

Submitted by Anonymous Coward
An anonymous reader writes "NASA's Opportunity rover has been rolling around the surface of Mars for over 10 years. It's still performing scientific observations, but the mission team has been dealing with a problem: the rover keeps rebooting. It's happened a dozen times this month, and the process is a bit more involved than rebooting a typical computer, taking a day or two to get back into operation every time. To try and fix this, the Opportunity team is planning a tricky operation: reformatting the flash memory from 125 million miles away. "Preparations include downloading to Earth all useful data remaining in the flash memory and switching the rover to an operating mode that does not use flash memory. Also, the team is restructuring the rover's communication sessions to use a slower data rate, which may add resilience in case of a reset during these preparations." The team suspects some of the flash memory cells are simply wearing out. The reformat is scheduled for some time in September."

Comment: Re:no price? (Score 3, Informative) 64

by TheRaven64 (#47790579) Attached to: MIPS Tempts Hackers With Raspbery Pi-like Dev Board
There's no price yet because they're giving away the first production run to people who are going to do interesting things with them. Unfortunately, this is a really bad time to do anything MIPS related (and I say this as someone who hacks on a MIPS IV compatible softcore and the LLVM MIPS back end). Imagination has just released the MIPS64r6 and MIPS32r6 specs. These are the biggest revisions to the MIPS ISA since MIPS III, which introduced 64-bit support. They've removed a load of legacy crap like the lwr and lwl instructions and the branch-likely instruction family and added things like compact (no delay slot) branch instructions, the requirement that hardware supports unaligned loads and stores (or, at least, that the OS traps and emulates them), and added much better support for PC-relative addressing. The result is a nice ISA, which is not backwards compatible with MIPS32r2 or MIPS64r2, the ISA that these boards use. Any investment in software for MIPS now is going to be wasted when products with the new ISA come out.

Comment: Re:*drool* (Score 1) 161

by TheRaven64 (#47790089) Attached to: Intel's Haswell-E Desktop CPU Debuts With Eight Cores, DDR4 Memory
For building big C++ projects, as long as the disk (yay SSDs!) can keep up, you can throw as many cores as you can get at the compile step and get a speedup, then sit dependent on single-thread performance for the linking. I got a huge speedup going from a Core 2 Duo to a Sandy Bridge quad i7, then another noticeable speedup going to a Haswell i7 in my laptop. The laptop is now sufficiently fast that I do a lot more locally - previously I'd mostly work on a remote server with 32 cores, 256GB of RAM (and a 3TB mirrored ZFS array with a 512GB SSD for ZIL and L2ARC), but now the laptop is only about a factor of 2 slower in terms of build times, so for developing individual components (e.g. LLVM+Clang) I'll use the laptop and only build the complete system on the server.
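
The shape of that curve is just Amdahl's law: the compile step scales with cores and the link step doesn't. A minimal sketch follows, where the 10% serial (link) fraction is a made-up number for illustration and not a measurement of any real build:

```python
def build_speedup(cores, serial_fraction):
    """Amdahl's law: speedup over one core when only the non-serial part
    of the build (the compile step) is spread across the cores."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / cores)


# Assumed for illustration: linking is 10% of a single-core build.
LINK_FRACTION = 0.10

for cores in (2, 4, 8, 32):
    print(cores, "cores:", round(build_speedup(cores, LINK_FRACTION), 2), "x")
```

Under that assumption the speedup can never exceed 10x however many cores you add, which is why single-thread performance still matters for the link.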

Comment: Re:A basic land line (Score 3, Informative) 573

by TheRaven64 (#47790061) Attached to: Ask Slashdot: What Old Technology Can't You Give Up?
There are several nice features of a landline, but they can't (in the UK, at least) compete on price. The line rental alone for a landline costs more than I spend on calls on my mobile (pre-pay, no contract, no monthly fees). Calls from my mobile are 3p/minute, a landline is £16/month. I'd need to spend almost 9 hours on the phone each month before I spent as much on my mobile as a landline would cost me before I even made any calls. And then, for the kicker, the calls from the landline cost 9p/min (+15p setup) for calls to other landlines or 12p/min (+15p setup) for calls to mobiles. There's no possible justification for calls from the landline costing 3-4 times as much as calls from the mobile on top of the extortionate line rental. If I wanted to pay BT even more, for another £3 I could get free evening and weekend calls to landlines, but calls to mobiles would still be the same price. For £7.50 on top of the line rental, I'd get free calls to landlines, and calls to mobiles would only be twice the cost of my mobile. Almost everyone I call has a mobile though, so in exchange for paying BT an amount equivalent to about 12 hours of calls on my mobile per month, I could then pay double per minute what I pay for calls on my mobile with no line rental.
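
For anyone who wants to check the arithmetic, here is a small sketch using only the prices quoted above (3p/minute on the mobile, £16/month line rental, £7.50 for the call package). The function name is made up for illustration.

```python
def break_even_hours(monthly_fee_pounds, mobile_pence_per_min=3):
    """Hours of mobile calls per month that cost as much as a fixed fee."""
    minutes = monthly_fee_pounds * 100 / mobile_pence_per_min
    return minutes / 60


line_rental_only = break_even_hours(16)          # line rental alone
with_call_package = break_even_hours(16 + 7.5)   # rental plus the £7.50 package

print(round(line_rental_only, 1), "hours")
print(round(with_call_package, 1), "hours")
```

The first figure comes out at just under 9 hours, matching the text. The second comes out at roughly 13 hours, a little more than the "about 12 hours" quoted.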

Comment: Re:isn't x86 RISC by now? (Score 1) 156

by Guy Harris (#47789305) Attached to: Research Shows RISC vs. CISC Doesn't Matter

The AMD-64 architecture - is that also register limited?

With 16 GPRs, it has fewer registers than all the major RISC architectures other than 32-bit ARM, just as the 16-GPR System/3x0 (including its 64-bit z/Architecture version) does. It's less register-limited than x86, but that's not setting the bar very high. (Note that IBM recently added instructions to z/Architecture that do arithmetic on the upper 32 bits of the GPRs; that suggests that there's some register pressure with only 16 GPRs, although if they still have to make use of base registers, even with PC-relative branches, that might add some additional pressure that x86-64 doesn't have.)

Or did AMD toss something like 32-64 program accessible registers @ the problem?

No, they didn't; x86-64 has, as noted, only 16.

And if they did, would Intel have limited theirs?

Limited their what?

Comment: Re:Please... (Score 2) 85

by hairyfeet (#47789251) Attached to: Mozilla To Support Public Key Pinning In Firefox 32
Try Pale Moon, friend. It's based on FF so you can keep your plugins, has a native 64-bit build, oh and the best part: NO STUPID NEW UI. In fact the devs have stated they will NOT be going to the new UI, PERIOD. It's fast, stable, and works so well in fact that I've started using it as my default browser even over my beloved Comodo Dragon because it's even snappier. Just a really great browser all around.

Comment: Re:Simple (Score 4, Interesting) 573

by hairyfeet (#47789099) Attached to: Ask Slashdot: What Old Technology Can't You Give Up?

DVDs. The reason why is they are cheap, easy to transport, and can hold a lot of data. With DVDs I can hand somebody 4GB+ of data for 15c including the sleeve, and when you can't predict how fast or reliable their net is? That comes in REAL handy.

So the pundits can talk cloud this and cloud that but as long as I can get 'em I'm gonna be using DVDs. Hell, if I had my way I'd still be using Lightscribe, but now that HP has pulled the plug it's getting harder and harder to find new burners with LS. Sucks, as it worked quite nicely.