
Comment Re:Quothe the raven, "Forevermore". (Score 1) 205

Don't be a jackass. He's obviously eccentric. In the overwhelmingly likely case, he'll find nothing but personal amusement from this, at a real cost of time and money. But if he discovers something cool, you'll be lining up to slob his knob. He's playing a lottery with his time and money, but he's buying the ticket for society, not himself.

Comment Re:Quothe the raven, "Forevermore". (Score 1) 205

It's very fair to point out that we could be missing something. In fact, we almost assuredly are! But it seems very unlikely to be something as fundamental as light.

If you were a member of a species that never used light in any way, you would eventually run into phenomena that would force you to test for it experimentally. For instance, your species would have dealt with heat since antiquity, and something very hot can transmit heat over a distance- but if you assumed this was all a fluid, or the movement of small particles, you would still be left with a big experimental gap between your model and reality, because some heat is transferred as infrared, even through a vacuum where conduction and convection can't explain it. The gaps in your science would become even more apparent as you got more advanced. You would need a concept of radiation to explain almost anything.

So we are left with:
1)- Phenomena that primarily exist far away. For instance, black holes could literally have any amount of strange shit going on in them, or actually be organisms, or whatever else. That's not your first rational guess, but we can't rule out as much as we would like, because we only have a fraction of the data about them that we would want.
2)- Phenomena that exist in trivial measure, but could be important. For instance, we can't perceive neutrinos, but they turned out to be real, detectable particles. Something like this could be waiting for the particle physicists; we can only hope.

These can be really important, but the problem is that things that are tiny or far away are by no means guaranteed to have huge effects on humans. Mostly, we are interested in creating, preserving, and ending life, so we mostly try to push our discoveries in that direction- and not every phenomenon is useful toward any of those goals.

Comment Re:slashdot and languages (Score 1) 336

So, I'll give you this: There's a lot of bullshit that can go on that I wasn't aware of, so this has been educational.

However, my answer doesn't change.

Class B will end up with a trivial constructor. The built-in types "int" and "float" will be left uninitialized.

Class D will first call the constructor for Class B, which will do the same thing (nothing). Then its named constructor will run, which has no effect on the inherited member variables.

I *think* you are trying to show a case where D leaves them uninitialized and B gives them the value initializer, but I'm not sure.

And while this has cost me like an hour and a half that I meant to spend typing bullshit to buddies about shooting down video game spaceships with purple lasers, it's been interesting.

Still- the case is rather contrived. In practice, if you want behavior like a struct, where stuff gets no values so you don't waste your time memsetting, C++ gives you that with a struct that looks like C, and in the case where you want to have useful values ready to go, an initializer list or set of assigns gives you that.

But holy SHIT the internet has a lot of words on this, and they've changed betwixt versions. As a programmer, the only safe thing is what I mostly see done: use the exact C notation when you want C behavior, and initialize everything carefully when you are trying to create an object sensibly. Mostly, you'd want to do this because even if you DID tune your class to lean on the exact rules, the precise behavior has changed from C++98 to 03 to 11, so it definitely shouldn't be relied on.
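For what it's worth, the version sensitivity is easy to demonstrate. The original B/D classes aren't quoted here, so this is a guessed reconstruction of their shape, but it shows the C++11 rules in action: an aggregate with no user-provided constructor gets zeroed by `{}`, while a class whose constructor ignores its members leaves them indeterminate.

```cpp
#include <cassert>

// Guessed reconstruction of the B/D shape under discussion (the originals
// aren't quoted here). B is a plain aggregate; D adds a do-nothing ctor.
struct B {
    int i;       // built-in members: the implicit ctor leaves these alone
    float f;
};
struct D : B {
    D() {}       // user-provided ctor that never touches the inherited members
};

// D d;   -> default-initialization: i and f hold indeterminate values
// D d{};  -> value-initialization, but D's user-provided ctor runs, so the
//            members are STILL not zeroed (C++11 rules; 98/03 differed here)
// B b{};  -> B has no user-provided ctor, so this zero-initializes i and f
```

The commented-out `D d;` and `D d{};` cases are exactly the trap: the braces look like they should zero things, and whether they do depends on the standard revision and on whether anyone added a constructor.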

So while I disagree with your precise point, and I still claim that C++ constructors are, in practice, not confusing (because people don't normally leave anything uninitialized unless they are very deliberate about that), I wholeheartedly agree with your overall point that this is a wildly complex topic that is not only preposterous to grasp in full, but subject to change by committee at any time.

Comment Re:slashdot and languages (Score 1) 336

Holy shit, are you for real?

The answer is, if you use an uninitialized variable, the results are undefined.

This isn't a C++ problem. You got some addresses, never stuck any data in them, and are asking about them. If the language didn't let you do that, it would have some serious issues. In practice, you will often find that this RAM is zeroed (fresh pages from the OS usually are), but nothing guarantees it.

Your complaint is "I want a language that doesn't let me create uninitialized variables". But that's a shitty language, and it means that merely declaring a variable implicitly generates work at run time. Fuck that.
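The point in six lines (function name is mine, just for illustration): declaring the variable generates no run-time work, and only reading it before a store is undefined.

```cpp
// Minimal sketch of the point above: declaring x costs nothing at run time;
// only reading it before a store is undefined behavior.
int declare_then_use() {
    int x;          // declared, never initialized -- no hidden run-time work
    // int y = x;   // reading x HERE would be undefined behavior
    x = 42;         // store something first...
    return x;       // ...then reading it is perfectly well-defined
}
```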

Comment Re:Lol (Score 1) 248

"Do you really think 12 happy faces fit in the same space as 12 letter 'i'?"

Holy fucking shit, who cares? If this were done by LETTER WIDTH, we wouldn't see the problem- the text would be converted to a graphic and pruned that way. Many characters are much SMALLER; 'M' is wider than 'i', etc. You are just so fucking wrong that you're fighting to stay wrong. It was never about font display.

I found the smiley that is 4 bytes. It is as wide as a capital W, to within pixels... on this PC.

Assuming that a character that is four bytes long is four times as wide as a one-byte character is just absurd, and you would NEVER have said it if you weren't trying so hard to be correct.

"And what you propose would split betwen a letter and a combining accent"

What YOU propose would do that too!! If your method landed on the accent mark, it would turn it into an ellipsis, stripping the circumflex or whatever from the letter. The fix for either would be trivial- you check the next character to see if it is a combining whatever, and then go to the one after that. Either way, you need to parse the goddamned string.

"Basically as soon as the words "N characters" come out of your mouth you are wrong."

NO FUCKING JUST NO HOLY FUCKING SHIT

Your thing works with a specific encoding, and makes some messages dramatically shorter than others. Doing it by characters instead of byte length isn't wrong, it's THE ONLY ACTUALLY CORRECT WAY!!!!!

UTF16:
You're correct about the surrogate order, and thank you. I'm not working with UTF-16 atm.

Comment Re:I am amazed (Score 1) 248

Agree totally, but it would be nice to add some UTF-8-specific functions that will, in ONLY those cases, actually walk the data (vastly slower for very large inputs than an index) and report on the string as UTF-8 instead of as raw bytes. Maybe those already exist in C/C++/whatever the kids laughably think will be the next thing over those.

Comment Re:Lol (Score 1) 248

"No you are wrong."

Pretty sure I'm not. We could just claim that back and forth, but let's go over this:

Here's what you said:

"Go X bytes into the string. If that byte is a continuation byte, back up. Back up a maximum of 3 times. This will find a truncation point that will not introduce more errors into the string than are already there."

Here's what I said:

"This only works for UTF-8, and theoretically fails with the older type of UTF-8 (when you could have up to 6 bytes, by spec). So you probably will have to go through it character by character, not byte by byte, exactly as Brons said."

So pretend you have a 12 character display. Your method, for UTF-8:
> Checks to see if the input is 12 or less bytes, and displays it fine (this works)
> If not, it goes to that 12th byte, then checks it to see if it is a continuation byte (a byte which, when ANDed with 0xC0, is equal to 0x80)
> If it is a continuation byte, and we haven't seen three in a row yet, increment the number seen, and back up one byte.
> If we found a non-continuation byte or we have seen three continuation bytes in a row, then what we are looking at must be a starter byte.
> Write four bytes, starting by overwriting the starter byte: 0xE2 0x80 0xA6 0x00 (ellipsis, then a null terminator)

With this method, you definitely could have left some garbage to the right of the null (if the stamp landed short of the original end), but that's OK, because the null ends the stream (if it doesn't, you'll need to pad with more nulls). An alternate method that doesn't stamp the null is vastly worse: if you were stamping the three-byte ellipsis over a two-byte character, you would eat the first byte of the NEXT multibyte sequence, leaving an illegal data stream and no null to tell the next guy to stop.
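Those steps, sketched with std::string (whose length field stands in for the trailing null of the fixed-buffer description; the function name is mine):

```cpp
#include <cstddef>
#include <string>

// The back-up-and-stamp truncation described above, UTF-8 only. Assumes
// well-formed UTF-8 and maxBytes >= 4 so the ellipsis always fits.
std::string truncateUtf8(const std::string& in, std::size_t maxBytes) {
    if (in.size() <= maxBytes) return in;            // fits as-is
    std::size_t i = maxBytes - 1;                    // the maxBytes-th byte
    int backups = 0;
    // A continuation byte, ANDed with 0xC0, equals 0x80.
    while (backups < 3 && (static_cast<unsigned char>(in[i]) & 0xC0) == 0x80) {
        --i;
        ++backups;
    }
    // i now sits on a starter byte; stamp the ellipsis over it.
    return in.substr(0, i) + "\xE2\x80\xA6";         // U+2026, 3 bytes in UTF-8
}
```

Feeding it a message of 4-byte emoji with maxBytes = 12 reproduces the two-faces-and-an-ellipsis result worked through below.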

But, anyway, this one works- like I said- but I claimed that it had two problems- "only for UTF-8" and "results in a VERY short message for some inputs". It also trivially fails for the pre-RFC-3629 UTF-8 standard, but I guess we are ok with that (that version can have up to five continuation bytes).

If your message was, lets say, 8 of the "smiling face with smiling eyes" emojis:
http://www.fileformat.info/inf...
(or equivalent 4 byte characters)

The algorithm of "go 12 bytes in" will skip past the first two entire "0xF0 0x9F 0x98 0x8A" sequences, landing on the 0x8A at the end of the third one. The algorithm will detect that this is a continuation byte, and back up the maximum number of times (through the 0x98 and the 0x9F), landing on (and stamping over) the 0xF0 initial byte. But this means that your output message is:

(happy face)(happy face)(ellipsis)

You took a 12-character display AND LIMITED IT TO TWO CHARACTERS (plus the ellipsis). When in fact, the original message would have fit, if you did what Brons said.

Because you searched in N bytes, instead of doing what Brons said (and that you even fucking called "MORONIC"), you fucked your hypothetical user AND insulted the guy with the right answer at the meeting (or were at least rude to him, brusque, or superior without cause).

But, lets continue.

I also claimed that this "only works for UTF-8". This is pretty trivially true- you explicitly refer to "continuation bytes", which are definitely not present in all encodings. UTF-16 is either one or two 16-bit words, and those words are not "continuation bytes". With such an input, you would go 2*N bytes forward, check whether the word you landed on is a low surrogate (meaning you're in the middle of a pair), and if so back up one word to the high surrogate that is the guaranteed start of the character, then stamp your ellipsis over it. This is the general equivalent of your UTF-8 solution, but it still dramatically shortens what your user can display: five happy faces and an ellipsis for their 8-character message that would have fit just fine.
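A sketch of that UTF-16 variant, assuming native-order char16_t units rather than a raw byte stream (function name is mine):

```cpp
#include <cstddef>
#include <vector>

// UTF-16 analogue of the byte-based truncation: go maxUnits code units in,
// and if that lands on a low surrogate (0xDC00..0xDFFF), we're mid-pair,
// so back up one unit to the high surrogate that starts the character.
std::vector<char16_t> truncateUtf16(const std::vector<char16_t>& in,
                                    std::size_t maxUnits) {
    if (in.size() <= maxUnits) return in;            // fits as-is
    std::size_t i = maxUnits - 1;                    // the maxUnits-th unit
    if (in[i] >= 0xDC00 && in[i] <= 0xDFFF) --i;     // low surrogate: back up
    std::vector<char16_t> out(in.begin(), in.begin() + i);
    out.push_back(u'\u2026');                        // one-unit ellipsis
    return out;
}
```

With eight emoji (eight surrogate pairs) and maxUnits = 12 this yields five faces plus the ellipsis, matching the count above.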

Unless the stream you are parsing is massive, such that byte-by-byting it would be costly, Brons has the correct solution. And a byte loop that spins over a string that fits in an iphone text message, stopping when it has seen N characters, for small N, certainly isn't in the unloopable universe.
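And Brons's approach, sketched the same way: walk the bytes once, recording where each character starts, and only cut when the CHARACTER count (not the byte count) overflows the display. Combining marks are ignored in this sketch, and the function name is mine.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Character-counting truncation: one pass over the bytes, noting the byte
// offset of every starter byte, then cutting only if the character count
// exceeds what the display can hold.
std::string truncateByChars(const std::string& in, std::size_t maxChars) {
    std::vector<std::size_t> starts;                 // byte offset of each char
    for (std::size_t i = 0; i < in.size(); ++i)
        if ((static_cast<unsigned char>(in[i]) & 0xC0) != 0x80)
            starts.push_back(i);
    if (starts.size() <= maxChars) return in;        // whole message fits
    // Keep maxChars-1 characters and spend the last cell on the ellipsis.
    return in.substr(0, starts[maxChars - 1]) + "\xE2\x80\xA6";
}
```

The eight-emoji message now survives a 12-character display intact, instead of being cut to two faces.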

You took the correct solution, out of all of slashdot, and shit in its mouth. So annoying.

Comment Re: Lol (Score 1) 248

In this case, the illegal UTF-8 sequence is the string after you have blown away part of its funny foreign squiggle. He's saying the thing you call with your newly minted mangled string shouldn't fail.

Which is one way to solve it. I would argue that if the thing you call mangles strings, you should sanitize its inputs so it never receives a string with a bad character (a Unicode character of whatever format it uses internally, post-mangle). Ideally, you would never go off to lala land, no matter your input- even if the guy upstream mangled it into something invalid.

Comment Re:Lol (Score 1) 248

The string doesn't just need funny characters in it, it needs them at a precise position (and apparently not just any character will break it, so it needs to have a particular expansion in whatever they used to encode their Unicode). A test case would have caught it, but it doesn't sound like a reasonable test case to have expected.

And yes, if you call a library that does some buggy truncation, you need to guard it on input.

Comment Re:Lol (Score 1) 248

This only works for UTF-8, and theoretically fails with the older type of UTF-8 (when you could have up to 6 bytes, by spec).

So you probably will have to go through it character by character, not byte by byte, exactly as Brons said. If you go N bytes into the string, and the string was just a ton of kanji, then you might truncate a VERY short message indeed- if you go looking 40 bytes in, you could be looking at a 10 character string or something for no reason, when your display would happily fit 40.

Comment Re:Lol (Score 1) 248

Yes, it is. Any input that will crash your library needs to be sanitized. You need to truncate the message on display, at the bad character. That doesn't mean you change the source message or stop people from communicating however they like- you, as the fucking programmer, SANITIZE YOUR INPUT, because otherwise you fuck the user.

Comment Re:what? (Score 3) 55

Blue light is higher energy than red light.

X-Rays are higher energy than any visible light.

Radio waves are lower energy than any visible light.

Gamma rays are higher energy than X-Rays (and all other photons, because past a point, we call everything a gamma ray)

http://en.wikipedia.org/wiki/E...

If you had enough energy to make one 350nm photon (a wavelength that just might be visible, maybe, as it is UV), you could instead make two 700nm photons with the same energy (which also might just barely be visible, as it is at the edge of infrared). More reasonably, if you had enough energy to make 3 blue photons, you could instead make 4 red ones with that same amount of energy.

http://en.wikipedia.org/wiki/V...
http://www.chemteam.info/Elect...
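The arithmetic behind that claim, as a sanity check (E = hc/lambda, so halving the wavelength doubles the per-photon energy; function name is mine):

```cpp
// Sanity check of the claim above: E = h*c / lambda, so one 350 nm photon
// carries exactly the energy of two 700 nm photons.
double photonEnergyJoules(double wavelengthMeters) {
    const double h = 6.62607015e-34;   // Planck constant, J*s
    const double c = 2.99792458e8;     // speed of light, m/s
    return h * c / wavelengthMeters;
}
```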
