Journal: Correlation and Causation

tepples wrote:

Correlation implies 25% likelihood of causation. Either A causes B, B causes A, C causes A and B, or chance.

In this post, Immerman wrote:

I *hate* seeing statistics abused. A 25% likelihood of causation is *not* implied. Yes, one of the four outcomes must be the case, but you don't know the relative probabilities of each. It's like grabbing a marble out of a bag containing red, green, blue, and yellow marbles - there's only four possibilities as to which color your marble is, but for all you know I filled the bag with blue marbles and just threw in a handful of the other colors, in which case it would be preposterous to claim a 25% chance of getting a red one.

I'm aware of the hyperbole in my illustration. They're probably not equally probable, but absent other evidence, one has to assume so. My point is that just because the probability isn't 100 percent doesn't mean it can always be treated as 0 percent. So if you want to plead false cause more effectively, explain why they're not equally probable. Be willing to discuss what further observations would be needed to show which of the four possibilities is most likely. But don't say "correlation does not imply causation" as if it were "correlation implies lack of causation" without providing evidence, as that's close to the fallacy fallacy and the black or white fallacy.
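The disagreement above can be put into numbers. The following is a minimal sketch, not a statistical method: the marble counts are made up for illustration, and `color_probabilities` is a hypothetical helper. It shows that four possible outcomes give 25 percent each only when nothing else is known, and that evidence about the contents of the bag changes the figure without driving it to zero.

```python
from fractions import Fraction

def color_probabilities(bag):
    """Return the chance of drawing each color, given a count per color."""
    total = sum(bag.values())
    return {color: Fraction(count, total) for color, count in bag.items()}

# With no information about the bag, treat the four colors as equally likely.
no_evidence = {"red": 1, "green": 1, "blue": 1, "yellow": 1}

# Hypothetical evidence: the bag is mostly blue with a few of the others.
mostly_blue = {"red": 1, "green": 1, "blue": 97, "yellow": 1}

for label, bag in (("no evidence", no_evidence), ("mostly blue", mostly_blue)):
    chances = color_probabilities(bag)
    print(label, {color: float(p) for color, p in chances.items()})
```

In the second bag the chance of red is far below 25 percent, but it still isn't zero, which is the point of asking for evidence about the relative probabilities.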

Journal: Stop Being a Broken Record

People tell me my Slashdot comments are repetitive. I'd appreciate some hints as to how to be less repetitive.

Sometimes it looks like I'm reminding other users of an unsolved problem, but the problem has in fact been solved. Perhaps the real problem is that the solution hasn't been well publicized. For example, one solution to a lot of problems with home entertainment is to put a PC in the living room, but almost nobody knows about this.

If it looks like I'm reminding other users of an edge case too often, consider that a solution that covers more edge cases will appear better thought out and more robust than a solution that covers only the common cases and leaves the edge cases unnoticed.

And sometimes I get confused as to which is the common case and which is the edge case. For example, h4rr4r has pointed out that whenever someone brings up Netflix as an alternative to cable television, I often bring up the fact that Netflix lacks sports. I try to phrase it like "Netflix is fine for people who aren't into sports", recognizing that both non-sports-fan and sports-fan markets exist but apparently putting undue emphasis on the sports-fan market. This goes back to discussions that I've had with heads of household in my survey sample. They tell me they don't see how Netflix would be worth an extra $7.99 per month on top of what they already pay for TV. So I try to make room for Netflix in their budget by suggesting how much they could save by switching from cable Internet+cable TV or fiber Internet+satellite TV to their current Internet+Netflix, and then they mention sports. I guess the survey sample of households in my extended family with broadband access must be a biased sample with more sports fans than the general population, and thus I have a biased view of the relative size of the sports-fan and non-sports-fan markets.

Journal: Other Public Options in the USA

I've received at least three, now four replies to my current Slashdot signature:

USA already has other public options: public schools and USPS Priority Mail over private schools and UPS 3 Day Select.

My signature points out that the United States has a history of public and private providers competing in the same market. For example, an engineering student in Indiana can go to Rose-Hulman Institute of Technology (private), or she can go to Purdue University (public). An online hobby store can ship packages to customers with United Parcel Service (private), or it can ship them with United States Postal Service (semipublic, funded by an exclusive contract with the US Government for mailing letters).

As of the third quarter of 2009, health insurance for United States residents under age 65 is mostly provided by employers, who make insurance available to their employees. But not all employers are large enough to qualify for group insurance plans, and some employers even restrict employees to part-time hours so that they don't have to offer coverage. Some insurers offer individual plans, but these are known for refusing to cover people with any of several sorts of preexisting conditions. Estimates of the number of documented U.S. residents without health insurance range from 8 million to 47 million.

The legislature of the United States, called the Congress, has recognized that the lack of universal coverage is holding America back compared to other highly developed countries. Its members have been debating whether to form a public health insurer to compete with private insurers; this hypothetical insurer has been nicknamed "Public Option" or "Obamacare" in the news media. Some more fiscally conservative members of the Congress argue that any public option would distort the market, and people would leave their current plans and end up on Obamacare. Yes, some people will switch from their current insurer to Obamacare, but that's to be expected: people switched from UPS 3-Day Select when USPS introduced flat-rate shipping boxes.

And the so-called "death panel" is actually called iMac.

Journal: Devil's Advocacy: ISP Throttles Non-HTTP Connections to 33%

Discussion forked from here:

the point is the plaintiff has to prove that you HAVE copied their work, not that you have to prove it is entirely original. The comment regarding therefore no worry is if you have NOT copied someone else work (for instance with a home video of you children, unless the plaintiff is a stalker) is to do with the side of the burden of proof.

The elements of copying are access and similarity. The plaintiff shows some similarity between the works. Then the plaintiff shows that the defendant should reasonably have had access to the work because the work was on the pop charts. This creates a rebuttable presumption of copying. My question: how would one rebut this presumption?

The large proportion of FOS developers feel it actually anathema to their whole project to charge even a nominal fee for their work.

CheapBytes distributes copies of free operating systems for a fee.

Firstly, the majority of large programs offered for download the company ask to be downloaded either from a FTP mirror or via bit torrent as it doesn't suck the entire bandwidth from their webhosting, slowing the website (which is what the HTTP Protocol is for).

What's the difference between an FTP mirror and an HTTP mirror in this case?

Windows Updates use SUP not HTTP

Google failed me on SUP, but it found Background Intelligent Transfer Service. That uses only 20 percent of bandwidth anyway, and the article is about throttling to 33 percent (or, alternatively, letting HTTP burst to 300 percent).

There are other encoders though that ARE FREE (and Open Source) - ffmpeg is a free encoder much like XVid, and unlike what you seem to think, does not break patents.

Any encoder for MPEG-4 Part 2 violates U.S. patents if not licensed by MPEG-LA, and as I understand it, MPEG-LA's standard license terms are incompatible with the four freedoms that define free software.

Are you just trying to dictate to EVERYBODY ELSE (your customers or otherwise) how you demand the internet to be used?

Yes, the ISP is trying to do so.

you also obviously have never played an online game.

I have played at least three Nintendo DS games over Nintendo Wi-Fi Connection. Animal Crossing: Wild World copies the map from the server to any client that joins, but that's only 88 KB of data.

Journal: Nintendo. Wheee.

In this discussion, godrik and I were discussing the relative merits of web applications that use AJAX techniques compared to local applications. I brought up the advantage that web apps can run even on machines where the user isn't allowed to install new software, such as someone else's PC, a set-top web terminal, or a video game console.

Godrik countered that he'd never buy a machine that didn't let its owner install software, and that when he wanted a console to play games on, he bought a Wii and jailbroke it using Bannerbomb. He mentioned plenty of established PC titles that have been ported to libogc, the library used by Wii homebrew: source ports of id Software's Doom and Quake, emulators such as FCE Ultra, ScummVM, and VisualBoyAdvance, and various Linux-original games that had been ported to about everything, such as SuperTux. These games had presumably recouped their costs of production entirely on the PC.

In general, there are four routes to being able to run code on a closed platform:

1. Make a web application that runs in the console's web browser. These browsers are usually severely limited in performance and in how much of the system's capability the browser exposes through the DOM. Some can't even read more than one gamepad at once, and they're impractical for playing handheld games away from Wi-Fi coverage.
2. Make a pay-per-download game and sell it through the console maker's online store. This is cost prohibitive due to various artificial overheads imposed by console makers such as Nintendo, including the requirements of a separate office and a prior commercial title on another platform.
3. Make a native game that ships on a retail game disc. This is even more cost prohibitive than download.
4. Make a "homebrew" game that relies on a jailbreak. This is the solution that godrik appears to prefer, but it has problems.

First, jailbreaks break the console's warranty or worse. There are anecdotal reports that Nintendo charges more for out-of-warranty service, such as disc drive replacement after the first 12 months, if a jailbreak is detected than if not.

Second, Nintendo can break Bannerbomb at any time by fixing the defect in a new version of the Wii Menu and IOS. Nintendo would install the fix on newly manufactured consoles and require an update before people can connect to Wii Shop Channel (workaround: WiiSCU) or start newly manufactured Game Discs normally (workaround: Gecko 1.8+). It could take weeks for a new sploit to be developed and released on sites such as WiiBrew, just as it took weeks from Wii Menu 4.0 to Bannerbomb.

But finally, the homebrew community frowns on charging for anything, especially the jailbreaks (Twilight Hack, Bannerbomb) and the launchers (BootMii, Homebrew Channel). That doesn't look good for somebody who wants to feed his family but isn't rich enough to afford the overhead of a license to develop on a closed platform, or even someone who just wants a little economic incentive not to abandon his projects.

One could develop for an open platform such as the PC, but as I mentioned in my last journal entry, not all genres fit well on such a personal computer. For example, a developer might want to make a social game designed to be played with gamepads, a big screen, a sofa, and three friends, such as Nintendo's Mario Party series or Super Smash Bros. series. But four adults can't easily fit around a PC's comparatively small monitor, and a lot of PC gamers seem to be keyboard-and-mouse fanboys who would make other players take turns if they're not old enough to work and buy their own PCs and their own copies of the game. One could go the "home theater PC" route, running gamepads through a USB hub and a VGA or DVI-to-HDMI cable to an HDTV, but two-thirds of U.S. households still have an SDTV in the living room, most PCs don't come with an S-Video output, and the PC to TV adapter isn't sold in stores. Likewise, music games with key sounds, such as Beatmania and Guitar Hero, can feel unresponsive on PC sound cards with their much higher audio latency.

But then, godrik wasn't referring to free as in free beer but instead to Free as in free speech. One way an author can rely on Free is to make the game a massively multiplayer online game based on subscriptions or micropayments. This has its drawbacks: more complexity, requirement for lag-tolerant game play design, cost of administering the game server, need for a separate PC per player, generally no opportunity for children to play due to COPPA and foreign counterparts, failure to reach people who regularly game away from a reliable Internet connection (such as laptop users or people living in the country), and the fact that a lot of people prefer to buy rather than effectively rent their games.

Another way is to make the game engine Free but to charge for the data files, much like Doom and Quake after their GPL release. But are there any success stories of shipping a retail or pay-per-download game whose engine is free software from day one?

Journal: Indie HTPC Games: The Rationale

In this comment, nuzak wrote:

If PC gaming is dying, HTPC gaming can revive it.

Considering the HTPC itself doesn't seem to be gaining much traction these past couple years, and consoles have been encroaching (albeit very slowly) on the HTPC space, I'm interested to hear what your view on the topic is.

There are two kinds of real-time multiplayer video game. Some games require one machine and screen for each player; these are historically associated with personal computers controlled by a keyboard and mouse, connected in either a local-area network or through the Internet. Other games allow multiple players to share a screen. Incidentally, this can be done without splitting the screen, as seen in Midway's Gauntlet, Hudson Soft's Bomberman series, Konami's Teenage Mutant Ninja Turtles (arcade), and Nintendo's Super Smash Bros. series. These traditionally run on arcade cabinets or on video game consoles with multiple gamepads. The historical reasons for this platform divide include the difficulty in connecting multiple gamepads and the difficulty in fitting four players' bodies around one 14- to 17-inch monitor.

But in the late 1990s, the line began to blur. At first, only consoles had hubs called "multitaps" to connect four gamepads to one machine, but starting with the popularization of USB in 1998, the PC has also had hubs that take multiple gamepads. In the early 2000s, more and more PCs have included composite and S-video outputs for a standard-definition television, and high-definition televisions have included VGA-style video inputs, solving the screen problem. The rise of home theater PCs has led to demand for multiplayer games designed to fit an HTPC.

Yet even in 2008, this demand has not been met, and the stigma of one PC per player remains. A minority of PC titles, such as Serious Sam, Lego Star Wars, and Midway Arcade Treasures, allow two players on one screen, but not much more. Even cross-platform games whose console version works with more than one gamepad tend to need one PC per player. The landscape of HTPC gaming is so barren that some people have recommended loading up an HTPC with emulators to run unauthorized copies of console game ROMs.

Much innovation in software comes from microISVs, or small businesses that develop software and distribute it on the Internet. These are often home-based businesses and in some cases are run more as a hobby or moonlighting enterprise than as a profit-seeking day job. Some microISVs make their money by developing proprietary software, distributing a trial version at no charge, and selling copies of a version with more features. Others, especially developers of free software, just take donations and advertisements. But the console makers have consistently excluded microISVs from the market. For example, from Nintendo's developer qualifications for Wii and WiiWare: "In addition, an Authorized Developer will have a stable business organization with secure office facilities separate from a personal residence ( Home offices do not meet this requirement )".

Imagine that the head of a microISV has written a design document for a video game intended for two to six people in one room looking at one screen. His team has developed a playable prototype that runs on Windows. For which platform should he and the rest of his team develop and market the final version?

Journal: Threading, Digressions, and Offtopic Moderations

In this comment, sethawoolley wrote:

if you don't like somebody's reply to an offtopic/hijacking/flamebait post, the best thing to do is to rate it "overrated", that way it doesn't go into moderation as an offtopic post, because, well, it was on the hijacked topic. That's the beauty of threading, isn't it -- topics can change.

Overrated simply means, relative to its current score, it's not something somebody browsing at what it's currently scored at would expect.

I "think" that's what the offtopic moderator wanted to say. Or they just got confused because my reply showed up underneath another topic such that the only way you can tell it's really a reply to a different topic was that there were double angle-lines that are easy to miss.

Tip 1: Be sure to quote the parts of the comment you're replying to. Quote multiple levels to recap the discussion from the original article to Slashdot's summary through parent comments if you feel it necessary.

Tip 2: If a comment is far enough off topic, and you can't tie it back to the article somehow, put it in your journal. Then, under the original comment, reply "See my journal" without bonus so that it at least shows up in the other user's messages.pl.

Journal: Noddy and Mr. Miyamoto?

Tonight, Nintendo's new video game console enters the hands of the obsessive-compulsive Americans who have queued up for over 12 hours. I am not one of them. So to pass the time, I put "Wii" into Google image search. Twice. But one result disturbed me (see image): Why is Noddy with Shigeru Miyamoto?

Blyton will probably take me to court for this given an incident from 1999 where I compared Noddy (pics) to Pinocchio (article) on a web page and got a cease-and-desist.

Journal: Bayesian Filtering: Is It Doomed?

Bayesian text classification is a statistical method of determining the probability that a message is in a given category. It works by making a database of how often each word occurs in messages from a corpus that are or aren't in the category, looking up this probability for each word in a given new message, and then using Bayes' theorem on the probabilities to predict how likely the message is to be in the category.
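The method described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not any particular filter's implementation: the class name `BayesianClassifier`, the tokenizer, the add-one smoothing, and the choice to consider only words present in the message are all simplifications chosen here for brevity.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Split a message into lowercase words (a deliberately crude tokenizer)."""
    return re.findall(r"[a-z0-9']+", text.lower())

class BayesianClassifier:
    def __init__(self):
        # Number of training messages containing each word, per class.
        self.word_counts = {True: Counter(), False: Counter()}
        self.message_counts = {True: 0, False: 0}

    def train(self, text, in_category):
        """Record one corpus message that is (or isn't) in the category."""
        self.message_counts[in_category] += 1
        self.word_counts[in_category].update(set(tokenize(text)))

    def probability(self, text):
        """Estimate P(category | words) by applying Bayes' theorem in log-odds form."""
        n_in = self.message_counts[True]
        n_out = self.message_counts[False]
        log_odds = math.log((n_in + 1) / (n_out + 1))  # smoothed prior odds
        for word in set(tokenize(text)):
            # Smoothed P(word | in category) and P(word | not in category).
            p_in = (self.word_counts[True][word] + 1) / (n_in + 2)
            p_out = (self.word_counts[False][word] + 1) / (n_out + 2)
            log_odds += math.log(p_in / p_out)
        # Convert log-odds back to a probability without overflowing.
        if log_odds >= 0:
            return 1 / (1 + math.exp(-log_odds))
        odds = math.exp(log_odds)
        return odds / (1 + odds)

classifier = BayesianClassifier()
classifier.train("cheap Viagra and Rolex offer", True)
classifier.train("mortgage offer from Nigeria", True)
classifier.train("meeting notes for the project", False)
classifier.train("lunch on Friday with the project team", False)
print(classifier.probability("Rolex mortgage offer"))
print(classifier.probability("project meeting on Friday"))
```

With this toy corpus the first message scores well above one half and the second well below it. A real filter would train on thousands of messages per user.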

When applied to the corpus "e-mail" and the category "unsolicited bulk e-mail", the method is called Bayesian spam filtering. For example, words such as "Viagra", "mortgage", "Rolex", "Nigeria", and the like are likely to occur in spam, but some other words are more likely not to occur in spam. Many e-mail service providers applied Bayesian filtering to their customers' incoming e-mail and moved likely spam into a separate folder. This worked ... for a while.

After several months, spammers discovered ingenious techniques to defeat filters. First they disguised the operative words by "creatively" spelling them. Some spammers just misspelled key words: "Ciallis", "mortagee". Others randomly replaced letters with near-homoglyphs from l33tsp34k or from foreign languages: "Viagra" became "Wla9ra" or "\/ 1 A G R @", or "porno" might use a Greek omicron or Cyrillic o or replace the 'p' with the Greek rho or Cyrillic er, both of which look like a Latin 'p'. Anti-spam filters eventually began to check for such techniques and flag them specifically.
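One way a filter might check for such disguises is to fold look-alike characters back to Latin letters before consulting a word list. The sketch below is an assumption about how such a check could work, not a description of any real filter. The `HOMOGLYPHS` table is a tiny hypothetical sample, and the digit mappings would mangle legitimate numbers in practice.

```python
# Tiny hypothetical table of look-alike characters; a real one would be far larger.
HOMOGLYPHS = {
    "\u03bf": "o",  # Greek small omicron
    "\u043e": "o",  # Cyrillic small o
    "\u03c1": "p",  # Greek small rho
    "\u0440": "p",  # Cyrillic small er
    "1": "i",
    "0": "o",
    "@": "a",
}

def normalize(token):
    """Fold look-alikes to Latin letters and drop spacing and punctuation."""
    text = token.lower().replace("\\/", "v")  # "\/" drawn to resemble a V
    text = "".join(HOMOGLYPHS.get(ch, ch) for ch in text)
    return "".join(ch for ch in text if ch.isalnum())

def disguised_form_of(token, watchlist):
    """Return the watchlist word that token disguises, or None if it is plain."""
    cleaned = normalize(token)
    if cleaned in watchlist and token.lower() != cleaned:
        return cleaned
    return None

watchlist = {"viagra", "porno"}
print(disguised_form_of("\\/ 1 A G R @", watchlist))
print(disguised_form_of("p\u043ern\u043e", watchlist))  # Cyrillic o twice
print(disguised_form_of("Viagra", watchlist))
```

A token that only matches the watchlist after folding is itself strong evidence of spam, because legitimate senders have little reason to write that way.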

Later, spammers attacked the method by using innocuous words in e-mail in order to fool the filter into thinking that a message is not spam. First they used random sequences of letters. Filters blocked words with too many consonants for the target language. Then they used random dictionary words. Filters blocked too many long words in a row. Then they used sentences from literature, as seen in so-called Gutenberg spam and Hobbit spam. These techniques are intended to increase the spam probability of innocuous words, introducing noise into the database and causing the filter to misclassify messages.
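The consonant check mentioned above might look something like the following. This is a rough sketch for English only. The threshold of five consecutive consonants and the decision to count 'y' as a vowel are assumptions made here, and `looks_random` is a hypothetical name.

```python
VOWELS = set("aeiouy")  # 'y' counted as a vowel so words like "rhythm" pass

def longest_consonant_run(word):
    """Length of the longest stretch of consecutive consonant letters."""
    longest = current = 0
    for ch in word.lower():
        if ch.isalpha() and ch not in VOWELS:
            current += 1
            longest = max(longest, current)
        else:
            current = 0
    return longest

def looks_random(word, max_run=5):
    """Flag words whose consonant runs exceed what English normally allows."""
    return longest_consonant_run(word) > max_run

for word in ("strengths", "mortgage", "xkqzvtrp"):
    print(word, looks_random(word))
```

Random dictionary words and sentences from literature pass this test easily, which is why spammers moved on to them.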

However, not all people have the same words marked as not-spam. For instance, people on a constructed language mailing list are more likely to have linguistic jargon marked as not-spam, while people on a video game mailing list may have video game terminology marked as not-spam. Thus, a spammer could collect addresses from a newsgroup, a public web board, a public mailing list, or the contact page of a public web site, and associate each address with words that appear on the same page as the address. How will Bayesian filters block this? Can it be blocked at all?

Journal: Pie Lovers Wobble But They Don't Fall Down

The old

The new

Weebles + Bob + 355/113 = Wobbl and Bob, now on DVD.

Journal: Yes, Copyright Infringement Is Theft.

At least in Indiana.

In the United States, federal law defines "copyright infringement" in Title 17, United States Code, and state law defines "theft". For example, in the State of Indiana, Indiana Code 35-43-4 defines the crime of "theft" as "knowingly or intentionally exert[ing] unauthorized control over property of another person, with intent to deprive the other person of any part of its value or use". In turn:

"a person's control over property of another person is "unauthorized" if it is exerted: [...] by transferring or reproducing:

• (A) recorded sounds; or
• (B) a live performance;

without consent of the owner of the master recording or the live performance, with intent to distribute the reproductions for a profit."

So yes, even pedants should recognize that some copyright infringements are considered theft. If you can come up with analogous laws in other U.S. states, please post the details in comments.

That does it.

People often discuss several clique web sites that require some sort of invitation before essential parts of the site become available. I'm not a Freemason; I'm not big on secret societies. I try to ignore those sites to the extent that I can because I don't want to jump in head-first without testing the waters.

In my spare time, I maintain free software for PC and Game Boy Advance and am in the middle of writing an ambitious GBA programming tutorial. However, I'm not entirely sure that the projects I maintain have a high enough profile in the general interest community to attract the "certifications" that allow me to write anywhere but inside my own profile page. Free software advocates seem to prefer to certify people who design their software to run natively on popular free software operating systems that run on PC hardware. However, though I do make an effort to use cross-platform toolkits, I currently do not and cannot test my PC software on any platform but Microsoft Windows. I can't just switch to GNU/Linux because it has no drivers for peripherals that I own, and I cannot afford to purchase new compatible peripherals. I can't just dual-boot because I have processes that don't like to be started and stopped every hour with downtime. Therefore, it appears I'm not the model free software developer that Advogato is shooting for.

MetaFilter

MetaFilter doesn't accept new users because it wants the community to remain small, that is, not much over 17,000 members. The administrator discovered that not only does the MeFi system eat copious amounts of valuable traffic and computing resources, but also the MeFi format itself doesn't scale past that many members for at least two reasons: things would drop off a reasonably-sized front page too quickly, and it would take too much labor to clean up inevitable dupes. It appears that the administrator wants erroneous information to persist uncorrected on comment pages and wants prospective new users to migrate to competing sites such as MonkeyFilter. Likewise, people who found Kuro5hin locked-down for several months were driven to Hulver's site instead.

Orkut

This is the biggie. Orkut is a purportedly popular by-invitation-only social networking web site. First, from what I've gathered in comments to this Slashdot story, Orkut is just a big bulletin board, not much better than a Yahoo! Group and much slower and less stable. Second, it's said to be full of Brazilians who refuse to use English in communities designated as English-speaking. Third, be prepared to delete Portuguese spam from your internal private message mailbox. Finally, for all I can tell, it might not even exist; it could just be an elaborate hoax, as broad and deep from the outside as EA's old Majestic immersive game.

Oh, and the name "Orkut" means something not safe for work in Finnish.

Gmail

Other than the increased storage space, is there really anything significant that Gmail provides that other popular web mail doesn't? Does it warrant switching e-mail providers from SpamCop?

Here's an invite code. Or here's a site that doesn't need an invite code. Just try the site, and if you can't get the hang of it, quit.

I value my time. Between participating in online communities [S] [G] [D] [B] [N], exercising at a local gym, writing free software, and babysitting, I feel that I may already be spreading myself too thin. In fact, I have had to become nearly inactive in several communities [K] [U] [P] [T] [R], to the point where some administrators have even deleted my account one or more times.

Perhaps when somebody decides that one of these communities wants me, by sending me a well-reasoned explanation of what I could get out of a membership, such as job leads in northeast Indiana, then I'll decide that I want the community. If you wish to contact me privately, feel free to do so.

Journal: How the Drinking Age Cements the Record Cartel

The Constitution for the United States of America is the supreme written law of the United States. It lays out a set of powers for an elected legislature called the Congress, reserving power over everything else to the several states (50 at last count). The Congress regulates commerce across state lines, but each state regulates commerce within its borders. This would seem to allow each state to set its own drinking age.

However, the Constitution has more to say: "The Congress shall have power ... To establish post offices and post roads". Nobody would want young intoxicated drivers on the highways, running the risk of colliding with postal trucks. Thus, courts have interpreted this grant of power as letting the Congress dictate the conditions under which states can qualify for federal funds for improving their highways.

Each state has power to set its own minimum age to purchase and consume "drinks" (beverages containing ethanol), but the Congress will not grant highway funds to states whose drinking age is less than 21 years. To make it easier to enforce this law, states have established separate licensing for establishments that serve food: "restaurants" admit minors, and "bars" don't. States also limit the number of drinks that restaurants can serve.

A "rock band" is a group of people who routinely perform live rock music together in front of an audience. A rock band can choose to perform in any of several venues: a recording studio, a stadium, a theatre, a restaurant, or a bar. Problem is that many people won't spend money on a record they've never heard, and radio stations charge an exorbitant "independent promotion" fee to get a record played. Stadiums and theatres also charge an exorbitant venue fee, which many local rock bands cannot afford. This leaves restaurants and bars, and very few restaurants find it profitable to let rock bands perform on their premises.

Local rock bands also have trouble getting their records heard on the radio.

Therefore, minors have nowhere to turn to see a local rock band perform. A captive audience of teenage listeners is exactly what the largest publishers of recorded music (hereinafter "major labels") want, as they find it easier to cultivate a Britney Spears or *NSYNC than to find real musical talent. Instead of buying records at shows, teenagers buy what they've heard on the radio, which the major labels control, or what they've seen in stadiums and theatres, which the major labels also control.

Journal: Five Blockers to Linux

Conventional wisdom holds that at least the following five problems block the adoption of Free operating environments such as GNU/Linux on home computers. What steps have GNU/Linux advocates begun to take in order to fix these?

1. The only consistency among graphical applications for GNU/Linux is that they consistently ignore the GUIdelines of their desktop environment.
2. Best Buy carries no peripherals with a penguin on the front of the box. A penguin would indicate that the IHV has chosen to include working Linux drivers on the disc bundled with the hardware. "Print out your distribution's hardware compatibility list and carry it into the store" does not easily apply to gifts from relatives.
3. Best Buy carries virtually no recent release proprietary 3D games designed for GNU/Linux, other than those few M-rated first-person shooters that include a Linux client binary on the CD alongside the Windows binary. Parents may find M-rated games unacceptable, or players may prefer MMORPGs or tactical simulations.
4. Best Buy carries no recent release proprietary educational games designed for GNU/Linux. People buy computers to run Reader Rabbit.
5. GNU/Linux lacks a DVD Video player application licensed by DVD Forum and DVD CCA.

