Comment: Re:why the focus on gender balance? (Score 1) 362

by Theovon (#47782941) Attached to: Why Women Have No Time For Wikipedia

No, it's a real problem here. Wikipedia is all about (1) information about the world, and (2) a neutral perspective on that information. Women do have a slightly different perspective, focusing on different information and different aspects of information. Including those additional perspectives will make Wikipedia content more complete and also more accessible to female readers.

Comment: CISC instruction sets are now abstractions (Score 1) 136

by Theovon (#47778545) Attached to: Research Shows RISC vs. CISC Doesn't Matter

And actually so is RISC to a degree on POWER processors.

Back in the 80's going RISC was a big deal. It simplified decode logic (which was a more appreciable portion of the circuit area), reduced the number of cycles and logic area necessary to execute an instruction, and was more amenable (by design) to pipelining. But this was back in the days when CISC processors actually directly executed their ISAs.

Today, CISC processors come with translation front-ends that convert their external ISA into a RISC-like internal representation. It's on-line dynamic binary translation. Now, instructions are broken down into simpler steps that are more amenable to pipelining and out-of-order scheduling. CISC processors don't execute CISC ISAs and therefore don't suffer from their drawbacks.
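
As a rough illustration of that front-end translation, here is a toy decoder in Python. The instruction tuples, the bracket notation for memory operands, and the micro-op and temporary names (`load`, `store`, `t0`, `t1`) are all made up for this sketch; real x86 decoders and their internal micro-op formats are far more involved.

```python
def translate(instr):
    """Break one toy two-operand CISC-style instruction into RISC-like micro-ops.

    instr is (op, dst, src). An operand written as "[reg]" is a memory
    operand addressed by reg (hypothetical notation for this sketch).
    """
    op, dst, src = instr
    uops = []

    # A memory source becomes an explicit load into a temporary register.
    if src.startswith("["):
        uops.append(("load", "t0", src.strip("[]")))
        src = "t0"

    if dst.startswith("["):
        # Read-modify-write on memory: load, operate, store back.
        addr = dst.strip("[]")
        uops.append(("load", "t1", addr))
        uops.append((op, "t1", "t1", src))
        uops.append(("store", addr, "t1"))
    else:
        # Register destination: a single three-operand micro-op.
        uops.append((op, dst, dst, src))
    return uops


if __name__ == "__main__":
    for instr in [("add", "rcx", "rax"), ("add", "[rbx]", "rax")]:
        print(instr, "->", translate(instr))
```

The register-to-register add stays a single micro-op, while the add to memory splits into a load, an add, and a store, each of which is simple enough to pipeline and schedule independently.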

It has occurred to me that this could be taken to its logical extreme. ISAs could be made entirely abstract and optimized to be used that way, along with optimizing them for reasonably efficient translation. You get the benefits of microops and the benefits of a CISC ISA (more compact code). Abstract ISAs make it easier to extend functionality in a backward-compatible way too. And unlike x86, we can shed some of the deadweight and also go to all 3-operand instructions, which have some benefits. Decoupling the ISA from the execution engine, we could get even more performance and energy efficiency than Intel does.

With a processor like Haswell, the logic area dedicated to translation is very small, which is why it doesn't matter much. On the other hand, with something like Atom, it occupies a more substantial portion of the total, making the translation (basically, elaborate decode logic) a burden on die area and therefore power consumption.

So it's not really appropriate to say it doesn't matter. It MOSTLY doesn't matter, because most of the drawbacks of CISC have been overcome. The fact that we're using an outdated CISC ISA for x86, however, has the drawbacks of having to support rare and excessively complex instructions, a plethora of addressing modes, and only having two operands per instruction.

Comment: Re:Particle state stored in fixed total # of bits? (Score 1) 242

by Theovon (#47768473) Attached to: Fermilab Begins Testing Holographic Universe Theory

IIRC, this isn't actually a paradox. One twin underwent acceleration, which leads to a temporal discontinuity. The other twin stayed in place. I guess if you have two clocks, and you accelerate one away from the other, you should be able to tell which one accelerated and which didn't.

Comment: Science as religion (Score 1) 511

by Theovon (#47767167) Attached to: Limiting the Teaching of the Scientific Process In Ohio

Not teaching the scientific process may just make things worse. Doubt is a fundamental tenet of science, but many religious people (e.g. Kirk Cameron and his ilk) feel that they were "just told to believe this stuff" when they were in school. Without knowledge of the process that led to this knowledge, students will just start to treat science as an alternate bad religion or something.

Now, many kids handle uncertainty poorly, so this has to be handled carefully, but I think it's critical that science be taught as "this is the best explanation we have." Now, basically everything taught in high school is well established (misrepresentations notwithstanding), so we can explain that what they're being taught is consistent with mountains of evidence. The key point, though, is that this stuff, at one time in the past, was cutting-edge knowledge and did deserve to be taken with a big grain of salt. This can be expressed in terms of the history and evolution of particular sciences. We understood A at this time, and then someone discovered something, and views shifted accordingly to B. See how new evidence led to a BETTER understanding through the scientific process??? What we're learning now has pretty well been beaten into submission, but understand that questioning assumptions is an important thing for people to learn.

Comment: This is a non-story (Score 0) 133

by Theovon (#47764931) Attached to: Time Warner Cable Experiences Nationwide Internet Outage

If we didn't hate TWC for other reasons, this would be dismissed as a quickly-corrected outage resulting from human error during some maintenance operation. But since we hate TWC, people make a big deal out of it and declare conspiracy and yell about bad customer service.

Nothing to see here, people.

Comment: Particle state stored in fixed total # of bits? (Score 1) 242

by Theovon (#47764449) Attached to: Fermilab Begins Testing Holographic Universe Theory

In special relativity, we find out that our velocity through spacetime is actually constant. If you move through space faster, you necessarily move through time more slowly.
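
One way to make that concrete is a quick numeric check. Time dilation gives dτ/dt = sqrt(1 - v²/c²), so the "speed through time" c·dτ/dt and the speed through space v always combine to c. The function names below are just illustrative.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def time_rate(v):
    """Proper time elapsed per unit coordinate time, d(tau)/dt, at speed v."""
    return math.sqrt(1.0 - (v / C) ** 2)


def spacetime_speed(v):
    """Combine speed through space (v) with speed through time (c * dtau/dt)."""
    return math.hypot(v, C * time_rate(v))


if __name__ == "__main__":
    for fraction in (0.0, 0.5, 0.9, 0.99):
        v = fraction * C
        print(f"v = {fraction:.2f}c  dtau/dt = {time_rate(v):.4f}  "
              f"combined / c = {spacetime_speed(v) / C:.6f}")
```

The combined value stays at c no matter what v is, so speeding up in space has to be paid for with a slower clock.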

So I'm wondering if information about particles is somehow limited to a specific amount of information. If you have more bits of precision about one thing, then the certainty about some other property is necessarily weaker because it doesn't get as many of the total number of bits something can have. Can we work out the number of bits? We need bits for position, bits for momentum, bits for other quantum mechanical properties, etc.

I'm wondering if perhaps superposition is a result of the number of bits for a given property (like spin) going to zero because they were required to increase the precision of something else. For that matter, I wonder if particles can share/trade bits, so that sometimes particles have no bits (like when they get absorbed). And maybe a body made up of particles has bits shared kinda like how a metal's conduction band is shared among all the atoms. Maybe that is the way force carriers act... trading bits. MAYBE the whole universe simply has a total number of bits, which are divided up as necessary among the particles. And really particle interactions are just bits (and their values) being traded around within a vast amorphous ocean of bits. In that case, particles are an illusion; they're an emergent property (from our perspective) of the varying association among bits.

Comment: Re:Static scheduling always performs poorly (Score 1) 125

Peer-reviewed venues don't reject things that are too novel on principle. They reject them on the basis of poor experimental evidence. I think someone's BS'ing you about the lack of novelty claim, but the lack of hard numbers makes sense.

Perhaps the best thing to do would be to synthesize Mill and some other processor (e.g. OpenRISC) for FPGA and then run a bunch of benchmarks. Along with logic area and energy usage, that would be more than good enough to get into ISCA, MICRO, or HPCA.

I see nothing about Mill that should make it unpublishable except for the developers refusing to fit into the scientific culture, present things in expected manners, write using conventional language, and do very well-controlled experiments.

One of my most-cited works was first rejected because it wasn't clear what the overhead was going to be. I had developed a novel forward error correction method, but I wasn't rigorous about the latencies or logic area. Once I actually coded it up in Verilog and got area and power measurements, along with tightly bounded latency statistics, then getting the paper accepted was a breeze.

Maybe I should look into contacting them about this.

Comment: Re:Static scheduling always performs poorly (Score 1) 125

by Theovon (#47664251) Attached to: NVIDIAs 64-bit Tegra K1: The Ghost of Transmeta Rides Again, Out of Order

I looked at the Mill memory system. The main clever bit is to be able to issue loads in advance, but have the data returned correspond to the time the instruction retires, not when it's issued. This avoids aliasing problems. Still, you can't always know your address way far in advance, and Mill still has challenges with hoisting loads over flow control.
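
Here is a toy model of just the semantics described above, not of Mill's actual hardware. The class name `DeferredLoad` and the dict standing in for memory are assumptions for this sketch. The point is only that a load issued early but bound at retire time still sees an intervening store to the same address.

```python
class DeferredLoad:
    """A load that records its address at issue but reads memory at retire."""

    def __init__(self, memory, address):
        self.memory = memory
        self.address = address  # only the address is captured at issue time

    def retire(self):
        # The value is taken at retire time, so earlier stores are visible.
        return self.memory[self.address]


if __name__ == "__main__":
    memory = {0x10: 1}

    eager_value = memory[0x10]            # conventional hoisted load: value read at issue
    deferred = DeferredLoad(memory, 0x10)  # deferred load: issued at the same point

    memory[0x10] = 2                      # aliasing store between issue and retire

    print("eager load saw:   ", eager_value)
    print("deferred load saw:", deferred.retire())
```

The eagerly hoisted load returns the stale value, which is exactly the aliasing hazard that normally stops a compiler from moving loads above stores.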

Comment: Re:Static scheduling always performs poorly (Score 1) 125

by Theovon (#47662733) Attached to: NVIDIAs 64-bit Tegra K1: The Ghost of Transmeta Rides Again, Out of Order

I've heard of Mill. I also tried reading about it and got bored part way through. I wonder why Mill hasn't gotten much traction. It also bugs me that it comes up on regular google but not google scholar. If they want to get traction with this architecture, they're going to have to start publishing in peer-reviewed venues.

Comment: Re:Static scheduling always performs poorly (Score 1) 125

by Theovon (#47662691) Attached to: NVIDIAs 64-bit Tegra K1: The Ghost of Transmeta Rides Again, Out of Order

Prefetch instructions issued hundreds of cycles ahead of time have to be highly speculative and are therefore likely to pull in data you don't need while missing some data you do need. If you can improve the cache statistics this way, you can improve performance, and if you avoid a lot of LLC misses, then you can massively improve performance. But cache pollution is as big a problem as misses because it causes conflict and capacity misses that you'd otherwise like to avoid.
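
To show the pollution effect in isolation, here is a small simulation of a fully associative LRU cache. Everything about it (the cache size, the access pattern, the function name `count_misses`) is a made-up example, not a model of any real cache hierarchy.

```python
from collections import OrderedDict


def count_misses(accesses, capacity, prefetches=None):
    """Count demand misses in a fully associative LRU cache.

    accesses:   list of cache-line addresses demanded by the program.
    prefetches: optional dict mapping an access index to the lines that are
                prefetched just before that access (hypothetical interface).
    """
    cache = OrderedDict()  # least recently used line is first
    prefetches = prefetches or {}

    def touch(line):
        """Bring a line into the cache; return True if it was already there."""
        if line in cache:
            cache.move_to_end(line)
            return True
        cache[line] = True
        if len(cache) > capacity:
            cache.popitem(last=False)  # evict the least recently used line
        return False

    misses = 0
    for i, line in enumerate(accesses):
        for prefetched in prefetches.get(i, ()):
            touch(prefetched)
        if not touch(line):
            misses += 1
    return misses


if __name__ == "__main__":
    accesses = [0, 1, 2, 3] * 10  # working set fits exactly in 4 lines
    useless = {i: [1000 + i] for i in range(len(accesses))}  # never-used lines
    print("no prefetch:      ", count_misses(accesses, capacity=4))
    print("useless prefetch: ", count_misses(accesses, capacity=4, prefetches=useless))
```

Without prefetching, the loop misses only while warming up. With one useless prefetch per access, the working set keeps getting evicted before it is reused, so the prefetcher turns hits into capacity misses.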

Anyhow, I see your point. If you can avoid 90% of your LLC misses by prefetching just into a massive last-level cache, then you can seriously boost your performance.

Comment: Static scheduling always performs poorly (Score 5, Informative) 125

I'm an expert on CPU architecture. (I have a PhD in this area.)

The idea of offloading instruction scheduling to the compiler is not new. This was particularly in mind when Intel designed Itanium, although it was a very important concept for in-order processors long before that. For most instruction sequences, latencies are predictable, so you can order instructions to improve throughput (reduce stalls). So it seems like a good idea to let the compiler do the work once and save on hardware. Except for one major monkey wrench:

Memory load instructions

Cache misses and therefore access latencies are effectively unpredictable. Sure, if you have a workload with a high cache hit rate, you can make assumptions about the L1D load latency and schedule instructions accordingly. That works okay. Until you have a workload with a lot of cache misses. Then in-order designs fall on their faces. Why? Because a load miss is often followed by many instructions that are not dependent on the load, but only an out-of-order processor can continue on ahead and actually execute some instructions while the load is being serviced. Moreover, OOO designs can queue up multiple load misses, overlapping their stall time, and they can get many more instructions already decoded and waiting in instruction queues, shortening their effective latency when they finally do start executing. Also, OOO processors can schedule dynamically around dynamic instruction sequences (i.e. flow control making the exact sequence of instructions unknown at compile time).
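
A toy timing model makes the overlap argument concrete. This is a sketch under heavy simplifying assumptions: one instruction enters the machine per cycle, issue width and functional units are ignored, and the latencies (200 cycles for a miss, 1 cycle for ALU work) are made-up numbers. The names `Instr`, `in_order_cycles`, and `ooo_cycles` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Instr:
    latency: int       # cycles from issue until the result is available
    deps: tuple = ()   # indices of earlier instructions this one needs


def in_order_cycles(program):
    """Stall-on-use in-order core: nothing may issue past a stalled instruction."""
    done = []
    next_issue = 0
    for ins in program:
        ready = max((done[d] for d in ins.deps), default=0)
        issue = max(next_issue, ready)
        done.append(issue + ins.latency)
        next_issue = issue + 1
    return max(done)


def ooo_cycles(program, window=8):
    """Out-of-order core: an instruction issues once its inputs are ready,
    as long as it fits in a reorder window that retires in program order."""
    enter, done, retire = [], [], []
    for i, ins in enumerate(program):
        e = enter[-1] + 1 if enter else 0
        if i >= window:
            e = max(e, retire[i - window])  # wait for a window slot to free up
        ready = max((done[d] for d in ins.deps), default=0)
        issue = max(e, ready)
        enter.append(e)
        done.append(issue + ins.latency)
        retire.append(max(done[i], retire[-1] if retire else 0))
    return max(done)


if __name__ == "__main__":
    program = [
        Instr(200),        # load A, misses in the cache
        Instr(1, (0,)),    # use A
        Instr(200),        # load B, independent of A, also misses
        Instr(1, (2,)),    # use B
    ]
    print("in-order:", in_order_cycles(program), "cycles")
    print("OOO:     ", ooo_cycles(program), "cycles")
```

In this model the in-order core serializes the two misses because the second load is stuck behind the first use, while the OOO core issues the second load almost immediately and overlaps the two stalls. Shrinking `window` to 1 makes the OOO model behave like the in-order one.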

One Sun engineer talking about Rock described modern software workloads as races between long memory stalls. Depending on the memory footprint, a workload could spend more than half its time waiting on what is otherwise a low-probability event. The processors blast through hundreds of instructions where the code has a high cache hit rate, and then they encounter a last-level cache miss and stall out completely for hundreds of cycles (generally not on the load itself but the first instruction dependent on the load, which always comes up pretty soon after). This pattern repeats over and over again, and the only way to deal with that is to hide as much of that stall as possible.
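
A back-of-the-envelope calculation shows how a rare event can eat half the run time. The numbers here are hypothetical (5 last-level misses per 1000 instructions, a 200-cycle penalty, a base CPI of 1), and the model assumes no stall time is hidden at all.

```python
def stall_fraction(misses_per_kilo_instr, miss_penalty_cycles, base_cpi=1.0):
    """Fraction of total cycles spent stalled on last-level cache misses,
    assuming every miss stalls the core for the full penalty."""
    stall_cpi = misses_per_kilo_instr * miss_penalty_cycles / 1000
    return stall_cpi / (base_cpi + stall_cpi)


if __name__ == "__main__":
    fraction = stall_fraction(misses_per_kilo_instr=5, miss_penalty_cycles=200)
    print(f"time spent stalled: {fraction:.0%}")
```

With those inputs only 0.5% of instructions miss, yet the stall cycles equal the useful cycles, so the core sits idle half the time.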

With an OOO design, an L1 miss/L2 hit can be effectively and dynamically hidden by the instruction window. L2 (or in any case the last level) misses are hundreds of cycles, but an OOO design can continue to fetch and execute instructions during that memory stall, hiding a lot of (although not all of) that stall. Although it's good for optimizing poorly-ordered sequences of predictable instructions, OOO is more than anything else a solution to the variable memory latency problem. In modern systems, memory latencies are variable and very high, making OOO a massive win on throughput.

Now, think about idle power and its impact on energy usage. When an in-order CPU stalls on memory, it's still burning power while waiting, while an OOO processor is still getting work done. As the idle proportion of total power increases, the usefulness of the extra die area for OOO increases, because, especially for interactive workloads, there is more frequent opportunity for the CPU to get its job done a lot sooner and then go into a low-power low-leakage state.
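
The energy argument can be sketched with simple arithmetic. All of the figures below are invented for illustration: an in-order core drawing 1.0 W that needs 10 ms for a task (stalls included), an OOO core drawing 1.5 W that finishes in 5 ms, and a 0.05 W sleep state for the rest of a 20 ms period.

```python
def energy_mj(active_power_w, busy_ms, sleep_power_w, period_ms):
    """Energy in millijoules over one period: run the task, then sleep."""
    sleep_ms = period_ms - busy_ms
    return active_power_w * busy_ms + sleep_power_w * sleep_ms


if __name__ == "__main__":
    # Hypothetical cores: the in-order one keeps burning power while stalled.
    in_order = energy_mj(active_power_w=1.0, busy_ms=10, sleep_power_w=0.05, period_ms=20)
    ooo = energy_mj(active_power_w=1.5, busy_ms=5, sleep_power_w=0.05, period_ms=20)
    print(f"in-order: {in_order:.2f} mJ, OOO: {ooo:.2f} mJ")
```

Under these assumptions the hungrier OOO core still uses less energy per period, because it spends more of the period in the low-power state. Change the numbers and the conclusion can flip, which is why the idle proportion of total power matters so much.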

So, back to the topic at hand: What they propose is basically static scheduling (by the compiler), except JIT. Very little information useful to instruction scheduling is going to be available JUST BEFORE time that is not available much earlier. What you'll basically get is some weak statistical information about which loads are more likely to stall than others, so that you can resequence instructions dependent on loads that are expected to stall. As a result, you may get a small improvement in throughput. What you don't get is the ability to handle unexpected stalls, overlapped stalls, or the ability to run ahead and execute only SOME of the instructions that follow the load. Those things are really what give OOO its advantages.

I'm not sure where to mention this, but in OOO processors, the hardware to roll back mispredicted branches (the reorder buffer) does double-duty. It's used for dependency tracking, precise exceptions, and speculative execution. In a complex in-order processor (say, one with a vector ALU), rolling back speculative execution (which you have to do on mispredicted branches) needs hardware that is only for that purpose, so it's not as well utilized.

Comment: Needed innovation: SLIM JAVA DOWN (Score 1) 371

by Theovon (#47637375) Attached to: Oracle Hasn't Killed Java -- But There's Still Time

Right now, if I want to ship an app that uses Java 8 features, I have to bundle an extra 40 megs of runtime. This is because Java 8 isn't yet the default. An extra 40 megs is stupid for simple apps. The runtime is an order of magnitude larger than the application. That's stupid.

If Java wants to innovate, they can find a way to maintain all the existing features and backward compatibility while using less space. That would be a worthy project and worthwhile for Java 9. They can make things smaller and perhaps even faster by rewriting things that are overly bloated.

Comment: The root of the problem is culture & social cl (Score 3, Interesting) 514

by Theovon (#47569505) Attached to: Jesse Jackson: Tech Diversity Is Next Civil Rights Step

For some reason, Americans have developed a stereotype of "white" and "black" that is related far more to social class than anything else. When you say "white," we imagine someone from the middle class. When you say "black," we imagine someone from lower socioeconomic status. How many blacks are in the middle class, I'm not sure, but as for whites in lower classes, we have them coming out our ears. While we may have millions of blacks who live in ghettos, we have 10 times as many whites living in trailer parks.

Because of our confusion between ethnicity and social class, we end up with things like Dave Chappelle's "Racial Draft":
While amusing, it highlights the real problem, and this false stereotype is widespread throughout American culture.

I recall an interview with Bill Cosby, talking about educational advancement among black children. Peers discourage each other from studying because it's "acting white." When in fact it is "acting middle class," because this same kind of discouragement occurs among lower class whites as well. As long as education is not valued within any group, that group will have difficulty being equally represented in white collar industries.

What we have to work out to explain the disparity between population demographics and white collar job demographics is the proportions of the underrepresented groups who discourage education. People like Jesse Jackson want to make this all out to be the result of prejudice on the basis of genetics or skin color. Honestly, I think we're long past that. There are still plenty of racist bastards out there, but in general, we do not have pink people acting intentionally or unconsciously to undermine the advancement of brown people when it comes to getting college degrees.

It's not PC to talk about genetic differences, but genetics is interesting. Geneticists have identified differences between different ethnic groups, and they have correlated them with some minor differences in physical and cognitive adaptations. Things like muscle tone, susceptibility to certain diseases, social ability, and other things have been correlated to a limited degree with variation in human DNA. But the average differences between genetic groups are minuscule compared to their overlap (statistically, we have very small mu / sigma for basically any meaningful measurable characteristic).

Thus I can only conclude that correcting any disparities must come from within. Regulating businesses won't do any good, because unqualified minorities will end up getting unfairly hired and promoted. We have to start with the children and get them to develop an interest in science and math. If Jesse Jackson wants to fix this problem, he needs to learn science and math and start teaching it. I assure you, even at his age, he has that capability, if he just cared enough to do it. Unfortunately for him, if he were to corrupt himself with this knowledge, he would find himself taking a wholly different approach than the "we're victims" schtick he's played most of his life. Personally, I prefer the "the universe is awesome" philosophy held by Neil deGrasse Tyson. He's one of my biggest heroes, having nothing to do with his skin tone.

One last thought: I'm sure someone will find something racist in what I have said. Either that or I'm being too anti-racist and appear like I'm overcompensating. There are also aspects of these social issues I know nothing about. I'm just writing a comment on Slashdot that is about as well-informed as any other comment. One thing people should think about in general is whether or not they have hidden prejudices. It's not their fault, having been brought up in a culture that takes certain things for granted. Instead of burying our heads in the sand, we should be willing to admit that we probably do have subconscious prejudices. That's okay, as long as we consciously behave in a way that is fair to other human beings, regardless of race, gender, sexual orientation, autism, or any other thing they didn't choose to be born with (and plenty of things they have chosen, because it's people's right to choose).