RyoShin - Slashdot User

Journal: How to StumbleUpon StumbleOver and StumbleOn

Journal by RyoShin on Monday September 17, 2007 @01:54AM

Like many on the internet, including other /. members, I am a user of StumbleUpon. For those who don't know what StumbleUpon is, the short and simple is that it is a gateway to the internet at large. It's great for lazy afternoons when you just want to find new webpages. You set some preferences, some content filters, and boom, you're off. It covers topics from architecture to zoology, and most everything in between. A great way to read new and interesting scientific discoveries or watch sleeping cats fall off of whatever shelf they happen to be sleeping on. Or both, if that's your thing. But not at the same time.

However, StumbleUpon reveals one of the larger annoyances of the internet: data redundancy. Site after site, blog after blog will host the same content (usually video or pictures, but it can even be word-for-word text), meaning that you'll wind up Stumbling Upon it time and again- and it really gets grating after you see the eighteenth LOLcat collection. To my knowledge, SU has no way to deal with this. You can rate things up or down and perhaps have less of a chance of seeing them, but that's not always the case.

To this end, I feel that StumbleUpon would do well to introduce two new features: StumbleOn and StumbleOver. Both features would be user preferences. You could choose to StumbleOn, StumbleOver, both, or neither (seeing the internet in a pure, unadulterated form).

StumbleOn would point every citing page back to the main page or site that it talks about. This is the harder of the two features to implement. The idea is that instead of stumbling upon a page that is either a rehash or just a quick blog entry about another page (usually done for ad hits), you would instead be redirected to (or On) the original page. Slashdot sees things like this- a summary for an article will contain a link to a blog that contains a link to the actual article. StumbleOn would cut out the blog entry, giving focus where it is rightly due: the original authors.

As stated, this is harder to do. Some things see circulation for so long that pinpointing the "original" is tedious (assuming it still exists). Then there are sites that jump up simultaneously, such as the smattering of lolcat sites that appeared within a few days/weeks of each other. Content can give some help. For instance, if a blog entry directly links to the original, you know you can StumbleOn to that original. Perhaps the video being shown lists a URL to use; failing that, you could StumbleOn to where it's hosted on Youtube/MediaCafe/whatever.

Part of the problem here is ballot stuffing. Someone might get a bunch of friends/paid hacks to all say that that person's site is the "original", though it would clearly be just a lame blog entry for ad hits. But, as with most systems like this, it can be overcome with other user adjustments. Then there's the risk that a blog entry that is actually useful, like dissecting a video or giving further insights, gets marked as StumbleOn. A second level might be introduced for these, but that would start making this very complex.
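As a rough sketch of the easy StumbleOn case, assume each page record carries at most one link to the page it cites. The `cites` mapping and the URLs below are made up for illustration; they are not anything StumbleUpon is known to expose. Following citations back to the root might then look like this in Python:

```python
def stumble_on(url, cites):
    """Follow 'cites' links from url until reaching a page that cites nothing.

    cites is a hypothetical mapping of page URL -> URL of the page it cites.
    If a loop is detected, stop and return the last page reached.
    """
    seen = {url}
    while url in cites:
        nxt = cites[url]
        if nxt in seen:  # pages citing each other in a loop; give up here
            break
        seen.add(nxt)
        url = nxt
    return url


# Made-up example: a summary links to a blog, which links to the article.
cites = {
    "summary.example/story": "blog.example/post",
    "blog.example/post": "news.example/article",
}
print(stumble_on("summary.example/story", cites))
```

This does nothing about ballot stuffing or about blog entries that add something useful; it only covers the case where a direct link to the original exists.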

StumbleOver is likely easier to implement and, in my opinion, the far more useful of the two. In the case of StumbleOver, you don't care what the original site is. You only know that you've seen it before and, even if you liked it, don't want to see it again from another site. Whereas StumbleOn would be pictorially represented as a tree, with one main site (the "root" site) being led to from many others, StumbleOver would be seen as a nice, round circle. Once you've seen one part of the circle you've seen them all, so you don't need to see the rest. This would lead to a lot less repetitiveness in your stumbles.
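A minimal sketch of the circle idea, assuming content is identified by some fingerprint. Here the fingerprint is just a SHA-256 hash of the raw text, which only catches word-for-word copies; the class name and URLs are invented for illustration.

```python
import hashlib


def fingerprint(content):
    """Return a hex digest that is equal only for identical content."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


class StumbleOver:
    """Remembers which content a user has seen, regardless of hosting site."""

    def __init__(self):
        self.seen = set()

    def should_show(self, content):
        """Return True the first time content is offered, False afterwards."""
        fp = fingerprint(content)
        if fp in self.seen:
            return False
        self.seen.add(fp)
        return True


user = StumbleOver()
pages = [
    ("siteA.example/cats", "cat falls off shelf"),
    ("siteB.example/lol", "cat falls off shelf"),  # same content, other site
    ("siteC.example/sci", "new exoplanet found"),
]
shown = [url for url, content in pages if user.should_show(content)]
print(shown)
```

The second site is skipped because its content matches something already seen, no matter which copy came first.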

However, this is not without its own problems- how specifically should content be measured? Most would agree that a word-for-word copy, a single image or set group of images, or a video would all be easy to StumbleOver. But what about a blog entry that restates the original text in the user's own words? Is one lolcat page with 10 images the same as another with 15? (This case can be kind of solved with StumbleOn, using something like icanhascheezburger as the main source.) What if someone has a higher quality version of another's video (quite unlikely, but possible)?
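One way to put a number on "how specific" is Jaccard similarity: the share of items two pages have in common out of all items on either page. This is offered only as an assumption about how it could be done, not something StumbleUpon is known to use. The image names and the 0.5 threshold below are arbitrary.

```python
def jaccard(a, b):
    """Return |a & b| / |a | b| for two collections (0.0 if both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


# Hypothetical lolcat pages: one with 10 images, one with the same 10 plus 5 more.
page_small = {f"cat{i}.jpg" for i in range(10)}
page_large = {f"cat{i}.jpg" for i in range(15)}

score = jaccard(page_small, page_large)
THRESHOLD = 0.5  # arbitrary cut-off for "seen one, seen them all"
print(round(score, 3), score >= THRESHOLD)
```

With these made-up pages the overlap is 10 shared images out of 15 total, so they would count as the same circle under this threshold. A reworded blog entry or a re-encoded video would need a fuzzier fingerprint than a set of file names.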

These aren't perfect ideas, and I have no idea how to submit them to StumbleUpon, but I think they would make great strides in making StumbleUpon a better product and the internet easier to browse.

Journal: Facebook Phone Number Folly 2

Journal by RyoShin on Tuesday August 07, 2007 @10:12PM

I, along with most of the Slashdot community, know much about social networking sites. I, probably unlike much of Slashdot, am a member of a few. One of these sites, Facebook, came under fire about a year ago for their News Feed feature, which allowed users to see updates made by their friends in one convenient form. This resulted in a massive and seemingly unexpected backlash by the Facebook crowd, which caused Facebook to lock it down only a few days later.

So users of Facebook are not ignorant of the privacy hazards that sharing information like that can lead to. However, it seems that some haven't learned their lesson. Through my own News Feed, I learned that one of my friends had recently joined a group. However, the group had a very odd title, almost like it was a personal journal entry. Curiosity got the best of me, and I clicked through to find out.

To my utter surprise and slight discomfort, I found that it was a group set up by someone who had lost his phone. He had set up the group for the express purpose of retrieving the phone numbers he had lost with the old one. With only one example, this could be written off as a misguided attempt, since a group may be an easy way to notify all of your friends. Facebook does allow for closed groups- close the group, and only the friends you've invited can see your new phone number or post their own. Perhaps this poor fellow merely misunderstood how the process worked.

With this in mind, I decided to plug "phone lost" into Facebook's search engine. The result is so many groups that it stops counting at 500. Yet not all is lost; once again, this could be a matter of convenience, and the other users might have closed their groups. I decided the best way to test this theory was to sample the first three pages of results and compile some (simple) stats. (Note that all numbers are assumed unique, which may skew the results in favor of panic.)

Total Groups: 27 (Facebook returned the same group a few times)
Open Groups: 81%
Total Members: 523
Phone Numbers (with Area Code): 184
Phone Numbers (no area code): 14

Percentage of Group Members with posted numbers: 37%

Average Membership per Group: 19
Potential Amount of Numbers available (with 500 groups): 3515
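The derived figures above can be reproduced from the raw counts, assuming the average and the percentage were truncated to whole numbers before being multiplied. That is a guess from the numbers themselves, not something stated. A quick Python check:

```python
groups = 27
members = 523
numbers_posted = 184 + 14  # with and without area code

avg_per_group = members // groups             # 523 / 27 = 19.37, truncated
pct_posted = numbers_posted * 100 // members  # 198 / 523 = 37.86%, truncated
potential = 500 * avg_per_group * pct_posted // 100

print(avg_per_group, pct_posted, potential)

# Without truncating the intermediate values the estimate is a little higher.
unrounded = 500 * numbers_posted / groups
print(round(unrounded))
```

Either way the estimate lands in the mid-3000s, and it still leans on the assumption that the first 27 groups are typical of all 500.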

Sadly, I was very wrong. The number of users willing to post their phone numbers in an open location such as that is worrisome. While Facebook does have the option to enter your number in your profile, it can be restricted to friends only. Furthermore, by default profiles are locked to friends-only. The combination of these two elements may have created a false sense of privacy among the users who did post their numbers.

A few users had the forethought to at least withhold their area code. Even so, Facebook provides their primary network (area, college, or high school), which could be used to figure out the area code in relatively short order. One group owner did ask for numbers to be e-mailed rather than posted, citing the desire not to broadcast them to Facebook. He was ignored by nine people.

While I hate the "Protect the Children" argument, I believe it has some merit in this case, and extends beyond that. These numbers are readily available for anyone on Facebook to use for their own malicious pleasure. Even if all they can do is leave psychotic voice messages at odd hours, it can still be enough to emotionally scar a person, as happened to another friend of mine earlier this year.

Still, it is no surprise that many in this generation, especially high school students, don't understand the potential ramifications of posting such personal information online. I do plan to contact Facebook and ask them to attempt to send out a notice to these users or Facebook in general, but even that may go ignored.

Journal: Practicing Practices

Journal by RyoShin on Friday January 26, 2007 @01:27PM

So much for that idea.

As a lowly intern in an internal programming department of a Fortune 500 company, I'm amazed at the quality of the code I read. Even though the department serves just one of the local facilities, you would think big money == big talent, right?

The apparent answer is no. I am in charge of maintaining over four dozen small internal web applications, written mainly in ASP and ColdFusion. (Not even .NET and MX - ugh.) I've read through and fixed up code done by a dozen other "programmers", some of them interns such as myself, some of them full-time "specialists", and rarely do I look at a page and not think "WTF?".

Part of the problem could be the "rigorous standards" put in place here- and by that, I mean there are none. Very few of the applications are used by more than 20 people in the entire building, so the general process of new program creation goes like this:

Program request goes to manager
Manager approves/denies program
Program is assigned to one of the available programmers
Programmer works as quickly as possible to finish project
Project is tested for approx. two hours
Manager makes sure that project looks good to requestor
Project is booted out door, and any bugs are fixed as they come up

Since none of the programs are large scale (even the few used by more than 20 people), this doesn't work too badly, though it doesn't include anything useful like code review.

The other problem, one that is glaring even in programs that did have better quality control (such as those where the programmer took the time to write out a scope and get it approved), is the widespread absence of proper programming practices. Repeated If-Thens where Switch-Cases should be used, code copied and pasted instead of put into a function/method, the same header code repeated on every page instead of put in a file to include, horrible naming schemes, bad use of whitespace, etc. Granted, programming styles will vary from person to person, but some of the things done within these are ludicrous.
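To make the first two complaints concrete, here is a small illustration in Python rather than ASP or ColdFusion. The status codes and function names are invented, not taken from any of the applications mentioned. The first function is the repeated If-Then style; the second replaces it with a single table lookup, which does the job a Switch-Case would do.

```python
def status_label_chain(code):
    """Repeated if-thens: every branch re-tests the same variable."""
    if code == 1:
        return "Requested"
    if code == 2:
        return "Approved"
    if code == 3:
        return "Assigned"
    if code == 4:
        return "Finished"
    return "Unknown"


# The same mapping, stated once as data.
STATUS_LABELS = {1: "Requested", 2: "Approved", 3: "Assigned", 4: "Finished"}


def status_label(code):
    """One table lookup in place of the chain, kept in a reusable function."""
    return STATUS_LABELS.get(code, "Unknown")


for code in (1, 4, 9):
    print(code, status_label(code))
```

Putting the lookup in one function also covers the copy-and-paste complaint: every page calls `status_label` instead of carrying its own copy of the chain.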

Thanks to sites like TheDailyWTF (an excellent time waster, which is also beneficial for programmers to see what not to do), I believe that this is not a local problem. It affects many of those who get into programming because they're looking for big bucks, especially when they start with languages like ColdFusion and Visual Basic (easy to write, and therefore easy to mess up). In my courses as a Computer Science major, I have yet to see anything that deals with proper programming practices. I realize that Computer Science is intended to go beyond programming itself, but even in the classes dedicated to programming it is not touched on much.

I would almost say that an entire course could be devoted to it, but I think that would be too much time. The various practices I'm thinking of are fairly simple; a week or two at most would be needed to go over them and make sure people understand them. Potential points would include:

Whitespace indentation
Descriptive Naming practices (I prefer lowerCamelCase, myself)
Programming for efficiency (redundant IF checks, using SWITCH-CASE instead of IF, proper loops, code reusability)
Function creation (as well as some talk about recursion)
Truth logic (using such things as truth tables)
Database setups (my college actually has an entire class for Databases, but this would be useful for those who aren't CS majors)
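As an example of the truth-logic point, a truth table can show that a clumsy condition and its simplified form agree on every input. This sketch checks one of De Morgan's laws; the function names are made up.

```python
from itertools import product


def truth_table(f, g):
    """Return rows (a, b, f(a, b), g(a, b)) for every combination of booleans."""
    return [(a, b, f(a, b), g(a, b)) for a, b in product([False, True], repeat=2)]


def original(a, b):
    return not (a or b)


def simplified(a, b):
    return (not a) and (not b)


rows = truth_table(original, simplified)
for row in rows:
    print(row)

# The two conditions are interchangeable if the last two columns always match.
equivalent = all(left == right for _, _, left, right in rows)
print("equivalent:", equivalent)
```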

I'm sure others have more things that should be added to the list (feel free to comment), but if colleges would put heavier emphasis on covering these kinds of things, maintaining programs would be easier for the rest of us.

Journal: The Road to Inlightenment is Paved with Gummy Bears

Journal by RyoShin on Monday January 22, 2007 @08:11PM

Cause they're tasty.

I've decided to use my Slashdot journal as a sort of "blog". Whereas I have a LiveJournal to rant about my personal life and day, this "blog" will deal more with issues that affect everyone, and not necessarily only topics that Slashdot as a site is concerned with (but I still get to rant).

Some entries will be long, some will be short, some will have no point. Regardless, this blog will be open to the public and comments will be on, though I will never "Publicize" any entry, unless I find it relevant to Slashdot somehow, as well as being well written and containing references. This will be one of those "choose two" things.

Ideally, I update every weekday. Gives me a good side-thing to do at work when I get bored. If I do it at work, I doubt I'll have much in the way of references- internet use is fairly restricted. If I save it and complete it at home, then I can include helpful links.

So, if you ever visit my profile, get ready for more stuff to ignore.

Here's looking to tomorrow.
