
Comment Re: Lol (Score 1) 248

From that description it does sound like the string is still valid. However if the display is crashing on a certain sequence containing an ellipsis, I am not clear why you can't construct that string directly, rather than rely on the insertion of the ellipsis.

It does sound like they rely on "sanitizing", but with a far more complex scheme than I was aware of. This is still wrong, maybe far worse, as they are detecting and rejecting patterns containing an ellipsis and some other character. That is INSANE!!! Any such work should be delayed until the VERY LAST moment. In this case their glyph layout should simply not crash on any possible arrangement of bytes or words in the incoming string. This is very much the same stupidity that I was ranting about for UTF-8. Nobody used to crash because you put misspelled words in your text and tried to print it. Apply the same logic to UTF-8 and Unicode. It is not hard and it seems really obvious, but for some reason Unicode turns some otherwise really smart programmers into total idiots.

Comment Re:I am amazed (Score 1) 248

I like that idea. You're right, it should be pretty efficient to implement, regardless of the string's backend encoding. And the value represented by the iterator will, by nature of being implemented as a pointer to a certain part of the string, be able to point to a glyph of arbitrary length (unlike a getter function with a fixed-length return type). Being an iterator, it'll fit into all the standard C++ library facilities that take iterators.

It would be nice to have it be a random-access iterator so that you can jump to an arbitrary offset. There are a lot of optimizations they could do internally to help facilitate that. But obviously you still want to let programmers choose, by some means or another, whether they want such Unicode optimizations (or Unicode iteration, and so forth). Because while the overhead they'd impose wouldn't be huge, there would still be overhead.

Comment Re:Terraforming potential? (Score 1) 278

Except wait - we've got a phase change from gas to plasma in there, which almost certainly breaks their calculations badly.

Again, no, you don't. All of the particles are moving in the same direction. They're not hot. They're not slamming into each other and kicking electrons off.

Do you think if you had a spacecraft moving at 25.4 kilometers per second it would be plasma too?

Submission + - Journalist fools media into publishing chocolate weight loss story (io9.com)

dinfinity writes: "“Slim by Chocolate!” the headlines blared. A team of German researchers had found that people on a low-carb diet lost weight 10 percent faster if they ate a chocolate bar every day. [...] It was discussed on television news shows. [...] My colleagues and I recruited actual human subjects in Germany. We ran an actual clinical trial, with subjects randomly assigned to different diet regimes. And the statistically significant benefits of chocolate that we reported are based on the actual data. It was, in fact, a fairly typical study for the field of diet research. Which is to say: It was terrible science. The results are meaningless, and the health claims that the media blasted out to millions of people around the world are utterly unfounded."

Comment Not if they think they can get more work out of us (Score 1) 140

If this works, the monied and in-power will make this as illegal as LSD and heroin.

Not necessarily.

If the anti-aging drug(s) make people healthier, reducing the drain on the government pensions and enabling the government to push the retirement age out over the horizon, so the people will be working and taxed, they might prefer to have the drugs put into use.

Heck, they'd probably add them to the water.

Comment Re:I am amazed (Score 1) 248

Actually I think "Unicode strings" should be avoided completely.

They do not help at all in doing text manipulation, because Unicode code points are *not* "characters" or any other unit that users think about. This is due to combining characters and invisible characters such as bidi indicators. There is a prefix code unit that eats the next 2 letters and turns it into a country flag! It is a huge mess.

Far more important is they all lack the ability to store errors that are in a UTF-8 string in a lossless way. This means you cannot trust arbitrary 8-bit data to survive translation to "Unicode" and back. This has been the source of endless bugs and is the reason people can't use Python 3.0.

Comment Re:I am amazed (Score 1) 248

My recommendation is special iterators on std::string. Something like this:


    // utf16_iterator is a proposed type, not something that exists today
    for (utf16_iterator i = string.begin(); i != string.end(); ++i) {
        int x = *i;
        if (x < 0) error_byte_found();
        else utf16_found(x);
    }

There would also be iterators for UTF-32 (probably what you were thinking of as "Unicode", but a lot of Microsoft programmers think "Unicode" means UTF-16). And iterators for other normalization forms. In all cases, for UTF-8 error bytes these would return negative numbers or some other value that cannot be confused with a code point.

This would be very fast because you can find the next Unicode code unit or whatever in constant time. Any API where you can arbitrarily index a unit using an integer is not going to be constant time; it will be linear in that integer. Iterators avoid this.
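A rough sketch of what such an iterator could look like, for the UTF-32 (code point) flavor rather than the UTF-16 one. Everything here is made up for illustration: the names utf8_iterator, utf8_begin and utf8_end are hypothetical, and the choice to return minus the byte value for an error byte is just one way to get "a value that cannot be confused with a code point".

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical read-only iterator over a UTF-8 std::string.
// Dereferencing yields a code point, or -(byte value) for a byte that is
// not part of a valid sequence.
class utf8_iterator {
public:
    utf8_iterator(const std::string& s, std::size_t pos) : s_(&s), pos_(pos) { decode(); }

    int operator*() const { assert(pos_ < s_->size()); return value_; }
    utf8_iterator& operator++() { pos_ += len_; decode(); return *this; }
    bool operator==(const utf8_iterator& o) const { return pos_ == o.pos_; }
    bool operator!=(const utf8_iterator& o) const { return pos_ != o.pos_; }

private:
    unsigned char byte(std::size_t i) const {
        return static_cast<unsigned char>((*s_)[pos_ + i]);
    }

    // Looks at no more than 4 bytes, so every step is constant time.
    void decode() {
        len_ = 1;
        value_ = 0;
        if (pos_ >= s_->size()) return;              // end iterator
        const unsigned char b0 = byte(0);
        value_ = -static_cast<int>(b0);              // error byte until proven valid
        if (b0 < 0x80) { value_ = b0; return; }      // ASCII
        std::size_t n = 0;
        int cp = 0, min = 0;
        if      ((b0 & 0xE0) == 0xC0) { n = 2; cp = b0 & 0x1F; min = 0x80; }
        else if ((b0 & 0xF0) == 0xE0) { n = 3; cp = b0 & 0x0F; min = 0x800; }
        else if ((b0 & 0xF8) == 0xF0) { n = 4; cp = b0 & 0x07; min = 0x10000; }
        if (n == 0 || pos_ + n > s_->size()) return; // stray or truncated
        for (std::size_t i = 1; i < n; ++i) {
            if ((byte(i) & 0xC0) != 0x80) return;    // missing continuation byte
            cp = (cp << 6) | (byte(i) & 0x3F);
        }
        // Overlong forms, surrogates and values past U+10FFFF are errors too.
        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) return;
        value_ = cp;
        len_ = n;
    }

    const std::string* s_;
    std::size_t pos_;
    std::size_t len_ = 1;
    int value_ = 0;
};

inline utf8_iterator utf8_begin(const std::string& s) { return utf8_iterator(s, 0); }
inline utf8_iterator utf8_end(const std::string& s) { return utf8_iterator(s, s.size()); }
```

You would loop with `for (utf8_iterator i = utf8_begin(s); i != utf8_end(s); ++i)` and test `*i < 0` for error bytes. The string itself is never copied or translated, so the original bytes, errors included, stay intact.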

Comment Re:Lol (Score 1) 248

No you don't. You are demonstrating the typical moronic attempts to deal with UTF-8.

Here is how you do it:

Go X bytes into the string. If that byte is a continuation byte, back up. Back up a maximum of 3 times. This will find a truncation point that will not introduce more errors into the string than are already there.
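In code that is about ten lines. This is only a sketch: the function names are made up, and the fallback for the case where you are still on a continuation byte after backing up 3 times is my own addition (a run that long is already broken, so cutting at the original spot adds no new errors).

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// True for bytes of the form 10xxxxxx, which can only continue a sequence.
inline bool is_continuation(unsigned char b) { return (b & 0xC0) == 0x80; }

// Returns a byte offset <= max_bytes at which s can be cut without
// splitting a valid UTF-8 sequence. Hypothetical helper name.
inline std::size_t utf8_truncation_point(const std::string& s, std::size_t max_bytes) {
    if (max_bytes >= s.size()) return s.size();     // nothing to cut
    std::size_t cut = max_bytes;
    // Back up over continuation bytes, at most 3 times.
    for (int steps = 0;
         steps < 3 && cut > 0 && is_continuation(static_cast<unsigned char>(s[cut]));
         ++steps) {
        --cut;
    }
    // Still on a continuation byte: the run was invalid anyway, cut where asked.
    if (is_continuation(static_cast<unsigned char>(s[cut]))) return max_bytes;
    assert(cut <= max_bytes);
    return cut;
}

inline std::string utf8_truncate(const std::string& s, std::size_t max_bytes) {
    return s.substr(0, utf8_truncation_point(s, max_bytes));
}
```

Note that the result is measured in bytes, not "characters", and that the function never has to decode anything.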

BUT BUT BUT I'm sure you are sputtering about how this won't give you exactly X "characters". NOBODY F**KING CARES!!!! If you want the string to "fit" you should be *measuring* it, not saying stuff that has not been true on computers since the 1950's about "N characters fit". I bet you think a combining letter and accent should count as 2, huh?

And your display function should not crash because it was given a string with an error in it! Even if you stupidly inserted the ellipsis all it should do is draw a few error indicators before the ellipsis.

Comment Re: Lol (Score 2) 248

No, the problem is code that pretends that illegal UTF-8 sequences magically don't exist!

For some reason UTF-8 turns otherwise intelligent programmers into complete morons. Here is another example from Apple. Let me state some rules about how to deal with UTF-8:

1. Stop thinking about "characters"!!!! This is a byte stream. The ONLY reason to think about a "character" is because you are DRAWING it on a display designed for a human to read, and humans do think about "characters". All other software either does not care, or is concerned with far more complex patterns (such as regexps and editors that deal with words and sentences). That second group is not helped at all by an intermediate translation.

2. It is TRIVIAL to detect that the byte sequence you are looking at is not a valid UTF-8 character. In this case draw a replacement for exactly ONE byte and then try the next byte to see if it is a valid sequence. Do not skip more. There must be one error per byte so that the maximum number of good characters is preserved and so that a sequence with errors can be parsed bidirectionally without looking more than a few bytes ahead, and so that it is possible to search for error patterns. It also means there are only 128 different errors, not millions.

3. NEVER "translate to Unicode" (i.e. UTF-16), because this will be a lossy conversion of these invalid sequences and thus you have not preserved the original data. I'm sorry, but Microsoft really screwed us here. The best recommendation is to write a wrapper around the filesystem calls and translate from UTF-8 to UTF-16 at the last moment, using U+DCxx as a translation for the error bytes (this is lossy, but filenames already are, due to case independence, Apple's normalization, and even on Unix where "./foo" and "foo" are the same file).

This is blatantly obvious if you substitute "words" for "characters" and imagine how you would write a program to deal with text strings. Words are also composed of multiple bytes in a row. For some reason nobody seems to crash on misspelled words, and they manage to concatenate and split strings and make whole file systems and diff programs and all kinds of other fancy text manipulation without having to translate the text so that each word is a fixed-sized integer. Amazing!
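Rules 2 and 3 fit in one small function. A sketch only, with invented names (decode_one, decode_all): a valid sequence gives its code point, and anything else consumes exactly one byte and gives U+DC00 plus that byte, so there are only 128 possible error values (U+DC80 through U+DCFF).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Decodes one unit at s[pos] and advances pos. Hypothetical helper.
// Valid UTF-8 sequence: returns its code point.
// Anything else: consumes ONE byte and returns 0xDC00 + byte.
inline char32_t decode_one(const std::string& s, std::size_t& pos) {
    assert(pos < s.size());
    const unsigned char b0 = static_cast<unsigned char>(s[pos]);
    if (b0 < 0x80) { ++pos; return b0; }                 // ASCII

    std::size_t len = 0;
    char32_t cp = 0, min = 0;
    if      ((b0 & 0xE0) == 0xC0) { len = 2; cp = b0 & 0x1F; min = 0x80; }
    else if ((b0 & 0xF0) == 0xE0) { len = 3; cp = b0 & 0x0F; min = 0x800; }
    else if ((b0 & 0xF8) == 0xF0) { len = 4; cp = b0 & 0x07; min = 0x10000; }

    bool ok = len != 0 && pos + len <= s.size();
    for (std::size_t i = 1; ok && i < len; ++i) {
        const unsigned char b = static_cast<unsigned char>(s[pos + i]);
        ok = (b & 0xC0) == 0x80;                         // must be 10xxxxxx
        cp = (cp << 6) | (b & 0x3F);
    }
    // Reject overlong forms, surrogates, and values past U+10FFFF.
    ok = ok && cp >= min && cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF);

    if (!ok) { ++pos; return 0xDC00 + b0; }              // one error per byte
    pos += len;
    return cp;
}

// Decodes a whole string; only meant for the last moment before drawing
// or before handing a name to a UTF-16 API.
inline std::vector<char32_t> decode_all(const std::string& s) {
    std::vector<char32_t> out;
    for (std::size_t pos = 0; pos < s.size();) out.push_back(decode_one(s, pos));
    return out;
}
```

Because an error never swallows more than one byte, a bad lead byte followed by good text costs you one replacement glyph and nothing else.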

Submission + - The Tricky Road Ahead for Android Gets Even Trickier 1

HughPickens.com writes: Farhad Manjoo writes in the NYT that, with over one billion devices sold in 2014, Android is the most popular operating system in the world by far, but that doesn't mean it's a financial success for Google. Apple vacuumed up nearly 90 percent of the profits in the smartphone business, which prompts a troubling question for Android and for Google: How will the search company, or anyone else for that matter, ever make much money from Android? First the good news: The fact that Google does not charge for Android, and that few phone manufacturers are extracting much of a profit from Android devices, means that much of the globe now enjoys decent smartphones and online services for low prices. But while Google makes most of its revenue from advertising, Android has so far been an ad dud compared with Apple’s iOS, whose users tend to have more money and spend a lot more time on their phones (and are, thus, more valuable to advertisers). Because Google pays billions to Apple to make its search engine the default search provider for iOS devices, the company collects much more from ads placed on Apple devices than from ads on Android devices.

The final threat for Google’s Android may be the most pernicious: What if a significant number of the people who adopted Android as their first smartphone move on to something else as they become power users? In Apple’s last two earnings calls, Tim Cook reported that the "majority" of those who switched to iPhone had owned a smartphone running Android. Apple has not specified the rate of switching, but a survey found that 16 percent of people who bought the latest iPhones previously owned Android devices; in China, that rate was 29 percent. For Google, this may not be terrible news in the short run. If Google already makes more from ads on iOS than Android, growth in iOS might actually be good for Google’s bottom line. Still, in the long run, the rise of Android switching sets up a terrible path for Google: losing the high end of the smartphone market to the iPhone, while the low end is under greater threat from noncooperative Android players like Cyanogen, which has a chance to snag as many as 1 billion handsets. Android has always been a tricky strategy, concludes Manjoo; now, after finding huge success, it seems only to be getting even trickier.

Submission + - Judge Classifies as Class Action An Email Scanning Lawsuit Against Yahoo (itworld.com)

itwbennett writes: A lawsuit that alleges Yahoo’s email scanning practices are illegal can proceed as a class action complaint, a development that will shine the spotlight on Yahoo Mail's use of message content for advertising purposes. Plaintiffs allege that emails sent to Yahoo Mail users by people who do not have Yahoo Mail accounts are scanned by Yahoo in violation of federal and California wiretapping laws.

Submission + - New Technique to Develop Single Molecule Diode

William Robinson writes: Under the direction of Latha Venkataraman, associate professor of applied physics at Columbia Engineering, researchers have designed a new technique to create a single-molecule diode that has a rectification ratio as high as 250 and an 'ON' current as high as 0.1 microamps. The idea of a single-molecule diode was first suggested in 1974 by Arieh Aviram and Mark Ratner, and it has been the 'holy grail' of molecular electronics ever since, because a single molecule represents the limit of miniaturization.

Comment Re:Terraforming potential? (Score 1) 278

First off, you're misusing temperature. You don't call it heat if all of the particles are moving in the same direction and un-ionized, you just call it "wind". It only becomes heat if that windstream suddenly slams into a non-moving solid surface and becomes instantly thermalized (but of course even then that would be a very short-lived event, as it would correspond with a pressure rise and the deflection of the stream behind the high-pressure zone). Nor would that be the wind speed at the surface since, obviously, wind forms boundary layers.

Secondly, hundreds of km/s from Venus escape to Mars intercept? That doesn't at all correspond to any delta-V chart I've ever seen.
