ThePhilips - Slashdot User

Comment Re:Too Many Ewoks (Score 1) 422

by ThePhilips on Friday January 23, 2015 @02:55PM (#48886807) Attached to: Disney Turned Down George Lucas's Star Wars Scripts

Not sure if this is insightful, informative or funny. Or all three.

Lucas's scripts for the prequels sucked badly. But modern Disney scripts are not much better.

Comment Re: Perl, my favorite language is rated higher... (Score 1) 386

by ThePhilips on Wednesday January 21, 2015 @08:59AM (#48864197) Attached to: Is D an Underrated Programming Language?

[...] in the same way broken Engrish can get a message across.

Which is kind of the whole point of programming languages: getting the message across. Telling the computer to do the job.

It's still not the same as knowing the language, and using it well.

Knowing a language != being able to write a good program in it. A real programmer can write a good program in any language.

And "using the language well" is just a meaningless statement, highlighting the modern focus on form rather than content. AKA a good program as a beautiful program vs. a good program as a working/useful program.

Why write several lines that do the job - when we can create beautiful class hierarchies spread across numerous libraries, and use modern concepts and paradigms to accomplish the same?

Comment Re:"Half Baked"? (Score 1) 243

by ThePhilips on Tuesday January 20, 2015 @01:33PM (#48857391) Attached to: Could Tizen Be the Next Android?

The fact it hasn't been on any significant number of devices in the real world would be a big flag, [...]

There is a significant number of Tizen devices in the real world. Several car manufacturers use it for IVI (in-vehicle infotainment) systems.

[...] there's a lot of refinement that comes from *actual* use in the wild that you don't get from lab development.

MS Windows? G+? Refinement from the actual use in the wild: zero.

Comment Re:It will be a mistake (Score 2) 243

by ThePhilips on Tuesday January 20, 2015 @01:24PM (#48857291) Attached to: Could Tizen Be the Next Android?

Samsung would never become a Tizen-only shop. They would go on making all possible devices, Android and WinPho included.

Otherwise, Nokia lost its #1 position because they failed to adapt their devices to new markets. That is precisely what Samsung tries to avoid with Tizen. Since there is no Google to set the rules on what can and cannot be an Android device and OS, Samsung (and others) can tweak Tizen to fit pretty much any device they like. After all, Tizen is larger than Samsung and is not exclusively a phone OS.

Comment Re:Well if that happens, it'll be bye bye Samsung. (Score 1) 243

by ThePhilips on Tuesday January 20, 2015 @01:11PM (#48857135) Attached to: Could Tizen Be the Next Android?

How much is an Android device worth without the Google Play store? Not much.

If you disable all the accounts/etc, a pure Samsung device wins hands down over the Google one, because most Google apps these days are crap, while Samsung's apps for the most part try to be simply useful.

And if you wish to stay up to date with the Android version, the Google account and oftentimes a G+ account are strongly suggested. Because occasionally Google forgets that not everybody is a Google/G+ user, and in the absence of the account some basic features simply do not work. That was the experience of some of my friends who updated their Galaxies to Lollipop, only to find that some stuff simply doesn't work because they do not have a G+ account.

Comment Generic Programming (Score 2) 42

by ThePhilips on Monday January 19, 2015 @03:39PM (#48851481) Attached to: Interviews: Alexander Stepanov and Daniel E. Rose Answer Your Questions

I think generic programming is destined to be a second-class citizen. IMO the human side of the problem is not the biggest one.

The biggest one is that there is no compiler which can untangle such code and generate something efficient out of it.

This is basically the attic where all the "everything is an object" languages have gotten stuck. On one side, development is made easier in many places because everything is an object. On the other side, the performance and memory consumption degrade, to the point where developers end up counting and optimizing object usage of every code line and function. Because there are no compilers capable of deducing from the human-readable code what the hell the developer actually wanted to accomplish.

That brings me to the other, bigger issue. Most concepts and paradigms which occupy the minds of researchers, generic programming included, do NOT help solve the ultimate problem of computer programming: efficiently communicating with the CPU, efficiently telling it what needs to be done.

If the developer is a writer, the CPU is a reader, and assembler is the spoken language, then most simple programs, with 10-50K instructions, are close to novel size. Think of it: the usual "Hello World" program is, to a CPU, close in size to a novel! And if it's in an interpreted language, the CPU might end up reading a whole frigging epic novel, just to deduce that all the developer wanted was to print "Hello World" on screen.

Comment Re:utf-32/ucs-4 (Score 1) 165

by ThePhilips on Monday January 12, 2015 @04:58AM (#48791489) Attached to: NetHack Development Team Polls Community For Advice On Unicode

Everybody has already settled on the little-endian presentation.

What makes you think this? There are plenty of old Motorola architecture based systems still in legacy environment use, preserved for stable scientific or business computing environments.

Man, I come from the BE world. You do not need to tell me that there is still an abundance of BE hardware.

And there is a significant amount of new, bi-endian hardware being produced now,

Most modern CPUs I had to deal with, except Intel, are bi-endian. BUT. Most (by model number) are used in BE mode. (But since ARM has also settled on LE, it is now effectively an LE world.)

Yet.

1st. The endianness of the CPU is not related to the endianness of a data exchange format.

2nd. The endianness of the data exchange format does not relate to the internal presentation of the data in the application's memory.
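
A minimal Python sketch of those two points, assuming an exchange format of 4 bytes per code point. The helper name `pack_code_point` is made up for illustration. The byte order is named explicitly when writing and again when reading, so the host CPU's own endianness never enters the picture:

```python
import sys

def pack_code_point(cp: int, byteorder: str) -> bytes:
    """Serialize one code point as 4 bytes in an explicitly chosen byte order."""
    return cp.to_bytes(4, byteorder)

cp = ord("A")                          # U+0041, held in memory as a plain integer
le = pack_code_point(cp, "little")     # exchange format: little-endian
be = pack_code_point(cp, "big")        # exchange format: big-endian

# The host byte order is printed only for reference; it does not affect le/be.
print(sys.byteorder, le.hex(), be.hex())

# Reading names the byte order again, so both streams give the same value.
assert int.from_bytes(le, "little") == int.from_bytes(be, "big") == cp
```

The same bytes come out on a BE or LE machine; only the order requested from the serializer matters.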

I'm afraid I have quite a lot of experience with Unicode compatibility and cross compatibility. Frankly, for a multi-platform tool like Nethack, I'd stay with the 8-bit, one byte, extremely stable 'POSIX' standard.

You folks lump it all together. There are two sides to it: internal presentation and external conversion.

For internal presentation, one goes with whatever makes your life as a developer easier. UCS-4 is definitely an option. UTF-8 (aka "I do not care, just passing data through") is also OK. Most applications fall into the latter category. But if one ever starts pondering the use of widechars, when one needs to actually peek at the data, then there is simply no point in using UTF-16. And UTF-8 has disadvantages when [...].

For external conversions, all that matters is that the internal format can be easily converted into the widely used encodings. The application doesn't have any direct control over it - it is user controlled. The user might pick UTF-8. Or JIS. Or win-1257. And the application has to make sure that when it spews the data to the outside, they come out in the encoding requested by the user.

The notion that utf-8 is used by everybody is extremely naive. And IMO it is rooted in the same arrogance which held back the *nix world for decades in the dark ages of 7-bit ASCII.
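
A rough Python sketch of that split, under a few assumptions: the internal form is a plain list of code points, the function name `emit` is hypothetical, and Python's `shift_jis` and `cp1257` codecs stand in for the "JIS" and "win-1257" mentioned above. Characters the chosen encoding cannot represent are replaced with `?` here, which is only one of several possible policies:

```python
def emit(code_points, user_encoding: str) -> bytes:
    """Convert internal code points to bytes in the encoding the user asked for."""
    text = "".join(chr(cp) for cp in code_points)
    # errors="replace" substitutes '?' for anything the target cannot encode.
    return text.encode(user_encoding, errors="replace")

internal = [0x0101]                    # LATIN SMALL LETTER A WITH MACRON

as_utf8 = emit(internal, "utf-8")      # two bytes
as_baltic = emit(internal, "cp1257")   # one byte in the Baltic code page
as_sjis = emit([0x65E5], "shift_jis")  # a kanji in a Japanese encoding

print(as_utf8, as_baltic, as_sjis)
```

The internal data never changes; only the last step depends on what the user picked.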

Comment Re:utf-32/ucs-4 (Score 1) 165

by ThePhilips on Sunday January 11, 2015 @04:41PM (#48788849) Attached to: NetHack Development Team Polls Community For Advice On Unicode

So what do you propose?

Go with utf-8, which doesn't alleviate any of the problems? But adds its own?

Besides, I doubt very much that anybody is going to use any of the fancy characters in NetHack.

Comment Re:utf-32/ucs-4 (Score 1) 165

by ThePhilips on Sunday January 11, 2015 @04:33PM (#48788813) Attached to: NetHack Development Team Polls Community For Advice On Unicode

Characters in Thai are rendered in display-order, and not logical order. [...]

Ha! Not relevant to me, actually. But very informative. Thanks.

Overall, most customers are aware of the problems (and in my experience better than me). The simple handling I had in my software worked and was sufficient.

The Thai language specifically is a cool example. Why not relevant? My company refused to do Thai localization. (And thanks to you, now I know fully why.) To do the localization we were told that we had to buy a special Thai language library. The library costs huge money. When we told the customer that they would have to pay for it, they refused and canceled the project, because for them it was too expensive.

Comment Re:utf-32/ucs-4 (Score 0) 165

by ThePhilips on Sunday January 11, 2015 @04:04PM (#48788647) Attached to: NetHack Development Team Polls Community For Advice On Unicode

So basically what you (and others) are saying, is that since there are some edge cases foreseen in the standard, nobody should try to make life easier even by a bit?

Combining characters (and the rest of the crap) pretty much never occur in real life. Only in some sadistic test case for the Unicode libraries, probably.

The main purpose of Unicode, why both users and developers want it, is to represent as many characters as possible with the least hassle possible. And that's pretty much what everybody's shooting for.

Comment Re:utf-32/ucs-4 (Score 1) 165

by ThePhilips on Sunday January 11, 2015 @02:47PM (#48788125) Attached to: NetHack Development Team Polls Community For Advice On Unicode

It's obvious you have little real experience with unicode, because saying 'just convert to utf-32' just papers over the problems without solving them.

Indeed I've only scratched the surface. And that alone gave me headaches for months.

UTF-32 units are code points, not characters, and there are many multi-code-point (variable length) characters in utf-32.

For example?

Comment Re:utf-32/ucs-4 (Score 1) 165

by ThePhilips on Sunday January 11, 2015 @02:41PM (#48788097) Attached to: NetHack Development Team Polls Community For Advice On Unicode

It is the same problem as with the fancy acute/agrave/etc special symbols.

And the special white-space/no-space characters. And the special writing direction change characters.

They are generally removed during normalization/conversion into canonical presentation.

The thing is, after the normalization, which is needed for any Unicode text anyway, UCS-4 becomes a plain array of characters. But UTF-8 still doesn't.
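
A small Python illustration of this point, assuming NFC as the normalization form and an accented letter that has a precomposed code point (sequences without a precomposed form stay multi-code-point even after NFC). After normalization the character is one 4-byte unit in UCS-4, but still two bytes in UTF-8:

```python
import unicodedata

decomposed = "e\u0301"   # 'e' followed by COMBINING ACUTE ACCENT: two code points
composed = unicodedata.normalize("NFC", decomposed)

# Number of 4-byte units needed in UCS-4 (little-endian, no BOM).
ucs4_units = len(composed.encode("utf-32-le")) // 4
# Number of bytes needed for the same character in UTF-8.
utf8_bytes = len(composed.encode("utf-8"))

print(len(decomposed), len(composed), ucs4_units, utf8_bytes)  # 2 1 1 2
```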

Comment Re:utf-32/ucs-4 (Score 2) 165

by ThePhilips on Sunday January 11, 2015 @01:24PM (#48787563) Attached to: NetHack Development Team Polls Community For Advice On Unicode

i don't see a real argument here. "considering the length". how long is it?

Check the game history. Literally decades between major releases.

"some of the silliness". what silliness is this exactly? external storage of utf-32 requires that one deal with an endian character set. every time any text is touched, you'll get to endian convert.

Everybody has already settled on the little-endian presentation.

isn't that awesome? utf-8 does not have this issue. and one can almost always treat utf8 as a byte stream. except in the rare case where one needs to know where character boundaries are. for example, to map the character to a font. the fast path is the common path (ascii), and just requires a single test ((c&0x80) == 0).

With UCS-4 you do not even need any tests.

Extracting a character - trivial.

Length of string - trivial.

Normalization - much simpler than the utf-8.
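
To make the comparison concrete, here is a hedged Python sketch working on raw bytes. The function names are made up for illustration, "length" means the number of code points, and the UTF-32 data is assumed to be little-endian without a BOM. UTF-8 needs a per-byte test to find the boundaries; UTF-32 is plain arithmetic:

```python
def utf8_length(data: bytes) -> int:
    """Count code points by skipping UTF-8 continuation bytes (10xxxxxx)."""
    return sum(1 for b in data if (b & 0xC0) != 0x80)

def utf32_length(data: bytes) -> int:
    """Every code point is exactly 4 bytes, so the length is a division."""
    return len(data) // 4

def utf32_char_at(data: bytes, index: int) -> str:
    """Extract the code point at a given position by direct offset."""
    start = 4 * index
    return chr(int.from_bytes(data[start:start + 4], "little"))

text = "na\u00efve"                    # 5 code points, one of them non-ASCII
utf8 = text.encode("utf-8")            # 6 bytes
utf32 = text.encode("utf-32-le")       # 20 bytes

print(utf8_length(utf8), utf32_length(utf32), utf32_char_at(utf32, 2))
```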

The sad reality is that the libraries I have seen actually implement utf-8 handling by using utf-32 internally. You can't avoid it: Unicode is specified in code points, which as you point out are already as good as 32 bits long.

sure the gnu c library has had bad wchar_t conversion routines in the past, but it's a free country. you can implement your own.

Frankly, I haven't even used the C library for the purpose. We already had a library developed in-house, because portable support for utf-8 is patchy at best.

The sanest portable approach is to link with iconv and convert everything from some internal presentation to external. Because you can never know what encoding the user needs. Unless you really need to save RAM (one has a shitload of string data), utf-8 simply sucks as an internal presentation.

P.S. I have had very little experience with Unicode. But several months of dealing with it have simply convinced me that if one has to deal with l10n/i18n, then utf-16/utf-32 are very good choices. Ditto if one has to deal with Unicode. If the application really doesn't care what it prints or reads, then pass-through binary (utf-8) works too. But as soon as one has to take the length of a utf-8 string (the real length), it is time to start switching from utf-8 to utf-32.

Comment utf-32/ucs-4 (Score 1) 165

by ThePhilips on Sunday January 11, 2015 @12:30PM (#48787231) Attached to: NetHack Development Team Polls Community For Advice On Unicode

Considering the length of their release cycle, it seems to be a safe choice.

It's not like the difference between 1/2/4 bytes would make much performance difference for an application like NetHack.

Using the utf-32 internally would save them from some of the silliness the alternatives like utf-8 bring with them.

Comment Re:Why not free education for life? (Score 1) 703

by ThePhilips on Friday January 09, 2015 @04:38AM (#48772973) Attached to: Obama Proposes 2 Years of Free Community College

Gee, money does grow on trees.

Since some money is actually made of actual paper, this is a factually correct statement!

Tax payers can afford to pay for lazy asses to never enter the work force by continuing education for a life time.

Two years is a life time?

Let me guess: you are from a redneck state with life expectancy under 35?

And while we are at it, why not free cable?

One can succeed in life without the cable. But not without the education.

Or perhaps free condoms?

Actually some schools and medical institutions already give away free condoms.
