Startup Webaroo to put the 'Web on a Hard Drive'? 340

An anonymous reader writes "A new startup called Webaroo is launching Monday with an audacious proposition: You can search the Web without a net connection of any kind. Initial release consists of 'Web packs' on specific topics such as news, city guides or Wikipedia. Later this year they're promising a full-Web version that you can carry on a laptop -- provided you're willing to devote something in the neighborhood of 80 gig."
This discussion has been archived. No new comments can be posted.

  • by sentientbeing ( 688713 ) on Sunday April 09, 2006 @01:50PM (#15095310)
    I'm sold. Does anyone have the .torrent for it?
  • Dotcom v3.0 (Score:5, Insightful)

    by Saven Marek ( 739395 ) on Sunday April 09, 2006 @01:50PM (#15095311)
    A new startup called Webaroo is launching Monday with an audacious proposition: You can search the Web without a net connection of any kind.

    If anyone doubted the next dotcom boom is upon us, this should put that doubt to rest.
    • Re:Dotcom v3.0 (Score:5, Insightful)

      by caffeinemessiah ( 918089 ) on Sunday April 09, 2006 @01:58PM (#15095355) Journal
      I was JUST thinking that. This seems like the beginning of a whole slew of semi-ridiculous ideas that get funded because their proponents seem 'ahead of their time'. Did someone at a funding company not think of the following two points:

      1) the web is growing at a phenomenal rate. in a few years, the only thing that you'll be able to fit on even high-density media is very narrow, specific content. is there really such a huge market for that?

      2) wifi is nearly ubiquitous. why pay for a static snapshot of the web that will be obsolete in a few days when you can walk into a starbucks with your laptop and get the fresh stuff almost for free??

      I'm sure the guys who want to put the web on a disk have thought these points through, but me...I just really want to sigh. and buy some short-term stocks.

    • by Milton Waddams ( 739213 ) on Sunday April 09, 2006 @02:29PM (#15095493)
      Wait. There have already been 2 dotcom booms? I know there was one in the mid to late 90s. When was the other one?

      Plus, I hope you're right. I'm starting my graduate IT job in July. I'm gonna start earning loads of money! :)
    • by gihan_ripper ( 785510 ) on Sunday April 09, 2006 @02:53PM (#15095578) Homepage

      Even if this is doable and legal, it runs entirely counter to the spirit of the Internet. The Internet on a hard disk is no longer a network, it becomes a passive entity with no possibility of interaction.

      At the moment, we are seeing a return to the interactive origins of the Internet, prime examples being blogging, Wikipedia, and even Slashdot! If this project takes off it will be harmful to interaction and will turn the Net into a glorified television.

      However, I find it unlikely that Webaroo will gain currency, precisely because we have become dependent on an interactive and living Internet. When I use the Net, I want to be able to read and respond to my emails, to check my bank balance, shop online, and read the latest news. Why on earth would I want to have a static Internet on my laptop?

    • Re:Dotcom v3.0 (Score:4, Interesting)

      by Philocke Fox ( 762396 ) on Sunday April 09, 2006 @03:26PM (#15095686)
      Robert X. Cringely had an article about this last year. http://www.pbs.org/cringely/pulpit/pulpit20050210.html [pbs.org]

      Basically what he said was that venture capitalists raised a whole bunch of money that they didn't spend during the last boom. This money is raised from investors and is given to the VCs for a limited time. The VCs make money from the management fees they collect for dealing with this money, usually 1 or 2% of the total amount. But, if they don't invest, then the money AND the fees get sent back to the original investors.

      The time limit on investment is usually about 10 years. So if we say that the boom started around '96, then some of these limits have already expired, and the rest of them will expire within the next 4 years.

      Use it or lose it. And the VCs will definitely use it.

    • by EmbeddedJanitor ( 597831 ) on Sunday April 09, 2006 @04:17PM (#15095852)
      There has been a large rise in start ups with hyped ideas (well at least if /. is your regular newsfeed) that starts to look a bit like another dot.bomb.

      Dot bombs are not about technically feasible ideas. They are not even about technology. They are all about putting together something that will appeal to venture capitalists. What really drove dot.bomb was that the VCs got into a feeding frenzy and all rational business plan/idea vetting went out of the window. For that to happen again means that a whole lot of people that got badly burnt, or that know someone that got badly burnt, must forget their bad experiences and get stupid and greedy again.

      The last dot.bomb had a fundamentally solid foundation: widescale adoption of internet. It was all the frilly bits that really were overhyped and caused the bomb. In the new wave, we seem to be seeing all the frilly bits and no solid core. Unless there's a solid core I expect the wave will implode long before things can get to the feeding frenzy stage.

  • by liliafan ( 454080 ) * on Sunday April 09, 2006 @01:50PM (#15095313) Homepage
    After reading the article, it sounds like they are just selling their web cache, nice idea but really unless they are selling really cheap I just can't see it picking up, especially considering the difficulties of getting the data to your drive, I mean an 80G download!

    Additionally what if I decide to follow site links that leave the cache?

    Yeah I can't really see this picking up.
    • Could just work... (Score:3, Interesting)

      by ELProphet ( 909179 ) <davidsouther@gmail.com> on Sunday April 09, 2006 @02:37PM (#15095538) Homepage
      But not in the way they think. TFA mentions two points, but doesn't explore them in depth. The first is the algorithms they use; let's face it, Google is starting to fall to the SEOs. If they have a new algorithm that was able to actually follow your web browsing all the way, they'd be able to provide much better results. Google claims to do this, but they can't follow you more than your first link. Second, they seem to pick up that most people find their entire information on the second or third link they visit.

      Combine these together, and the program could offer you 80 gigs of data to just sit on your computer and be sifted through at your leisure. It would be able to follow you through, and find exactly how you get through your data. When it needs to, it can spider into areas that it might think you'd want to go (Been looking at a lot of Wikipedia? Next time you connect, it goes and picks up some wikibooks).

      The best part is that all the "Big Brother" information is being stored on YOUR computer, not their servers. You want that info, Bush? You'll have to subpoena every user.

      If they targeted this more towards a desktop-search type thing with better search algos than Google, this could just work.
    • Well, for an additional fee you'll be able to get a "Webaroo Subscription", which will allow you to connect to the internet and download additional content. I'm sure that this, combined with an optional subscription for real-time content-updates will make this product a smashing success.
    • by twitter ( 104583 ) on Sunday April 09, 2006 @02:37PM (#15095542) Homepage Journal
      The wayback machine's [archive.org] terabytes [archive.org] of data is what this really takes. Keeping it up to date is another story.

      Archives are good and this can be a useful service. Providing 80 select gigs on a hard drive to libraries and schools is useful until US networks get where they should be. Their software can keep those 80 GB up to snuff at night. When you leave the cache, you ... gasp ... get the new content. In the mean time, things are much faster when it matters. Mirrored content will always be a good idea. Look at the debian distribution system, for example.

      Good luck to the people at Webaroo. So long as they don't apply for stupid patents that give them an exclusive franchise to distribution systems, they are AOK.

      The road warrior thing will flop, though. People are going to stay where there's a network or pay the $10. It's the one piece of live information that requires the hook up. The speed of the rest is gravy for those people.

    • by dogwelder99 ( 896835 ) on Sunday April 09, 2006 @03:19PM (#15095661)
      From TFA...
      The company and service officially emerge from behind their stealth shield tomorrow armed with a flashy bundling agreement from laptop maker Acer.
      Most likely, the reason behind this awesomely silly "feature" is getting people to pay more for laptops with larger hard drives, with marketing promising "search the web without an internet connection!"

      And, of course, selling a subscription service that lets you download updates of your favorite internet content to your laptop... a technology formerly known as, well, "browsing the web". Using slick marketing to sell people stuff they already have, was a huge success for the bottled water industry... can't blame these guys for trying it on the internets.

  • by minus_273 ( 174041 ) <{aaaaa} {at} {SPAM.yahoo.com}> on Sunday April 09, 2006 @01:50PM (#15095315) Journal
    when someone asked if the internet will fit on a floppy?
  • by arthurpaliden ( 939626 ) on Sunday April 09, 2006 @01:51PM (#15095318)
    How soon till the first lawsuit is filed?
  • by php_krisp ( 858209 ) on Sunday April 09, 2006 @01:52PM (#15095320)
    Is this really the right time to try this, when wi-fi connections are popping up all over the place and the internet's bigger than it ever has been before?
    • by know1 ( 854868 ) on Sunday April 09, 2006 @02:03PM (#15095374)
      it would always be the right time, if only for the possibility that the bomb drops and we have to live a mad max style existence scavenging and fighting over laptop batteries and petrol in old stores throughout the land. If that happens, i wanna be able to read uncyclopedia [uncyclopedia.org] at the end of the day.
      If it didn't happen i would be like the guy who loses his glasses in that old story and can't read even though he has eternity "but there was time now..." or whatever.
    • Not just access (Score:3, Interesting)

      by David Hume ( 200499 ) on Sunday April 09, 2006 @02:12PM (#15095421) Homepage
      FTFA:
      Which isn't to say that ever more ubiquitous 'Net connections won't pose a challenge to the Webaroo business model.

      "Long-term their opportunity may have more to do with [search] performance" than the offline capability itself, Enderle says.

      Husick tells me that performance benefit was reinforced for the company by a rousing reception their service received from Japanese mobile operators who he says were salivating over Webaroo as a means to siphon search traffic away from their increasingly crowded wireless broadband networks.

      Webaroo will also be touting the potential cost savings and convenience of its service.

      "Every hotel I go to wants to charge me $10 to $15 a night for Internet. Every airport wants to charge me another $10 to get connected," Husick says. "If I've got five minutes before I have to board my flight, do I want to spend that five minutes connecting or do I want to spend five minutes getting my search answer?"
      I still think this is a business scheme destined to fail. It may be a business plan that is designed to survive only long enough to cash out.
       
      I've got news for Husick. I'm a lawyer who has sets of Statutes, Court Rules and Local Rules behind his desk. I still look them up online to make sure I have the most recent version. I can't afford not to.
       
      Search performance? Rarely, if ever a problem.
       
      Siphon traffic away from "increasingly crowded broadband networks?" They make money from that traffic. They can't, if necessary, charge per data download? Tier the service by download bandwidth? Charge more? Build a better network?
       
      The first cell phone or wireless device that expects me to pre-download some portion of the net, that portion being determined by somebody else, is the first one I can cross off my list.

      Save $5 or $10 at the airport by not connecting? What if I want to send or receive e-mail? Get the latest news, business or stock information? I'm AT AN AIRPORT, which implies I have some money, and in his context that I'm on business. I'm going to forgo a net connection for $5 or $10? If my employer is that tight, I'm looking for another job anyway -- one that doesn't use Webaroo's services.

      This reminds me of software solutions to cramped hard drive spaces awhile back. On the fly file compression and expansion when data size was outstripping hard drive size for a short period of time. (Remember the file corruption.) Even though there was a market for those products, barely, everyone and his brother knew that market was going to go away Real Soon Now.
      • by birge ( 866103 ) on Sunday April 09, 2006 @03:03PM (#15095610) Homepage
        Like most lawyers I know, you seem to miss the possibility for more than one option to exist in the world. (That's why you guys make such great politicians.) But the world of engineering is about increasing the number of possibilities, quite unlike the zero sum game from which most lawyers skim off the top. It's quite reasonable to suppose that there are times when having a laptop that has both wireless connectivity AND a static snapshot of the more useful parts of the net would be fantastic. For example, maybe in the airport you connect and grab e-mail but once you get on the airplane it would still be nice to have last week's snapshot of the internet available to you while on the flight. There will always be times when the internet is unavailable, either through technical problems or the realities of your location. Having another, albeit lesser, option is always nice and these guys are trying to provide that.
  • ownership (Score:3, Interesting)

    by xzvf ( 924443 ) on Sunday April 09, 2006 @01:52PM (#15095321)
    Wouldn't there be an issue here of selling another person's content? While everyone can view the content at will, copying that information to media and then reselling it, or even distributing it for free, would be an issue.
  • Copyright? (Score:2, Insightful)

    by MustardMan ( 52102 ) on Sunday April 09, 2006 @01:53PM (#15095323)
    Considering the fact that companies are suing google for putting the first paragraph of their news tidbits on google news, how long will it be before someone sues webaroo for copyright infringement? Whether the claim is valid or reasonable or not is a moot point - someone is gonna see this as infringement and call out their pack of rabid lawyers.
    • by ScrewMaster ( 602015 ) on Sunday April 09, 2006 @02:03PM (#15095375)
      Well, since this is a start up they're not going to have very deep pockets, so unless someone is truly disturbed about copyright infringement I doubt you'll see too much legal action right away. No money in it. And I would expect that if anyone did complain Webaroo would immediately remove the offending content from future versions: they'd be fools to do otherwise. However, if (by some amazing happenstance) this becomes popular and profitable, expect multiple packs of hungry, rabid lawyers to move in for the kill. Isn't it amazing how the patent and copyright systems work to advance the useful arts and sciences nowadays?
      • by vidarh ( 309115 ) <vidar@hokstad.com> on Sunday April 09, 2006 @04:54PM (#15096004) Homepage Journal
        Doesn't matter if they remove it. IANAL, but if they put a large number of people's content (as opposed to small snippets which can be defensible) on a CD and distribute it without verifying either that the copyright holder has granted a license for it to be used that way, or contacting the copyright holder to get a license, it is a clear case of copyright infringement and there's no way they'd be able to get a judge to believe it wasn't willful.

        The combination of willful copyright infringement and a profit motive == mandatory fines and a high chance of prison.

        Unless these guys are very careful about not violating anyone's copyright it only takes one party in a case like this to care enough, and they're bankrupt. Perhaps they are, but if so, the idea of the "internet on a harddisk" is far away from the reality of it (as if it wasn't anyway)

  • by joe 155 ( 937621 ) on Sunday April 09, 2006 @01:53PM (#15095327) Journal
    look at news without a net connection? Either this is going to be just the same as viewing pages offline after you've been on them (perhaps an automated web crawler which grabs pages whilst you have some up time) or you will be viewing very old news... It seems to be the former though, in which case you're not really doing it "without a connection"... so why bother? this seems like a waste of space and time (and bandwidth), just look at what you want to when you're plugged in rather than constantly getting information you may never need
  • by KenDodd ( 961972 ) on Sunday April 09, 2006 @01:55PM (#15095341) Homepage
    For example, where do we get the porn diffs?
  • by RobotRunAmok ( 595286 ) on Sunday April 09, 2006 @01:56PM (#15095346)
    Been around since the early 90's. Back then it was called "fan fiction."
  • by Idimmu Xul ( 204345 ) on Sunday April 09, 2006 @01:57PM (#15095351) Homepage Journal
    e.g. searching? Having Wikipedia on your hdd is all well and good, but if you can't easily search it, what's the point?
  • by mtrisk ( 770081 ) on Sunday April 09, 2006 @01:59PM (#15095358) Journal
    They should be selling their compression technology!
  • 80 gig web? (Score:3, Insightful)

    by hlh_nospam ( 178327 ) <instructor@NosPam.celtic-fiddler.com> on Sunday April 09, 2006 @02:04PM (#15095380) Homepage Journal
    That would cover about 0.0000000001% of the web, give or take a few dozen orders of magnitude.
    • by AndreiK ( 908718 ) <AKrotkov@gmail.com> on Sunday April 09, 2006 @02:10PM (#15095406) Homepage
      So that would be between 10^-34 and 10^14 percent?
    • by TTK Ciar ( 698795 ) on Sunday April 09, 2006 @02:34PM (#15095518) Homepage Journal

      It's more like 0.15%, if they use the same compression and content selection criteria as the Internet Archive. If they eschewed all non-HTML content (graphics files, PDFs, etc) that would go up quite a bit. If they used better compression (the Archive uses gzip) it would go up some more.

      An average crawl of the public web, minus files which are "too large" (not sure what the threshold is for that), makes about 55TB of gzipped archive. 80GB / 55000GB = 0.0014545, or about 0.15%.

      -- TTK
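
For anyone who wants to check the arithmetic above, here is a minimal sketch in Python. The two constants are just the figures quoted in the comment (80 GB of laptop disk, roughly 55 TB for a gzipped crawl); nothing is measured here, and the function name is made up for illustration.

```python
# Figures quoted in the comment above; they are not measured by this script.
ARCHIVE_CRAWL_GB = 55_000  # roughly 55 TB of gzipped crawl data
LAPTOP_BUDGET_GB = 80      # disk space the full-Web version is said to need


def coverage_fraction(budget_gb: float, crawl_gb: float) -> float:
    """Return the share of the crawl (0..1) that fits in the disk budget."""
    return budget_gb / crawl_gb


fraction = coverage_fraction(LAPTOP_BUDGET_GB, ARCHIVE_CRAWL_GB)
print(f"{fraction:.7f} of the crawl, or about {fraction:.2%}")
```

Dropping images and PDFs, or using a better compressor than gzip, would shrink the denominator, which is why the comment expects the percentage to go up in those cases.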

  • by omeg ( 907329 ) on Sunday April 09, 2006 @02:11PM (#15095414)
    "The Internet Archive Wayback Machine contains approximately 1 petabyte of data and is currently growing at a rate of 20 terabytes per month. This eclipses the amount of text contained in the world's largest libraries, including the Library of Congress. If you tried to place the entire contents of the archive onto floppy disks (we don't recommend this!) and laid them end to end, it would stretch from New York, past Los Angeles, and halfway to Hawaii."

    Internet Archive Frequently Asked Questions [archive.org]
  • by Doc Ruby ( 173196 ) on Sunday April 09, 2006 @02:12PM (#15095416) Homepage Journal
    How big is Google's index of the Web, complete with URLs of results? I could search that, only a day out of date, without a Net connection, if it fit on a HD. Maybe using Usenet to distribute it...
    • by jfengel ( 409917 ) on Sunday April 09, 2006 @03:15PM (#15095647) Homepage Journal
      Google's indexes probably run to many terabytes. Google indexes roughly a billion pages. If each page has a thousand words, and each word can be reduced to a single 32-bit number in the index, that comes to 4 terabytes. And it's probably much, much higher than that; this is a back-of-the-envelope calculation.

      Especially since there's considerable redundancy; they can't search all that data that quickly without throwing multiple computers at it. Even if you could have a local Google copy, it would run very slowly.

      To run the calculation another way: when Google Desktop indexes your hard drive, it takes up around 10% of that drive. The web contains many petabytes, and even though much of that is pictures, there are still petabytes of text to index.

      In fact, it's Google's ability to store and quickly access all that information that's more interesting than the Pagerank algorithm (for which there are other, equally good candidates). Google's ability to manage all that data is the real reason Google is able to say to the world, "Come, everybody gets free 2 gigabyte accounts! We've got more disks than you can shake a stick at!" And why they can store satellite photos to cover the globe and serve them up, and store vast quantities of free video, etc. And make caches of all of those web pages (at least the text part).
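
The back-of-the-envelope estimate above is easy to reproduce. This is only a sketch using the comment's own assumptions (a billion pages, a thousand words per page, one 32-bit number per word); the constant and function names are hypothetical, and the real index layout is unknown.

```python
# Assumptions taken from the comment above, not from any published figures.
PAGES = 1_000_000_000   # "roughly a billion pages"
WORDS_PER_PAGE = 1_000  # assumed average page length
BYTES_PER_WORD_ID = 4   # one 32-bit number per indexed word
DRIVE_BYTES = 80 * 10**9  # an 80 GB laptop drive, in decimal gigabytes


def index_size_bytes(pages: int, words_per_page: int, bytes_per_word: int) -> int:
    """Naive index size: one fixed-width word id per word occurrence."""
    return pages * words_per_page * bytes_per_word


size = index_size_bytes(PAGES, WORDS_PER_PAGE, BYTES_PER_WORD_ID)
size_tb = size / 10**12
drives_needed = size / DRIVE_BYTES
print(f"index: {size_tb:.0f} TB, or {drives_needed:.0f} drives of 80 GB each")
```

Even under these generous assumptions the naive index is far larger than a single 80 GB drive, which is the comment's point.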
  • Pr0n? (Score:4, Funny)

    by Dante Shamest ( 813622 ) on Sunday April 09, 2006 @02:14PM (#15095432)
    Would the downloadable content include porn?

    Er, I'm asking this in order to, er, protect my girlfriend's sensibilities. Can't have her unwittingly downloading such naughty stuff you know. =)
  • by josepha48 ( 13953 ) on Sunday April 09, 2006 @02:14PM (#15095435) Journal
    I see issues of copyright coming up. Just linking to sites these days can get people into trouble, what will be the repercussions of essentially taking all this data and stuffing it on someone's hard drive?
  • by SeaFox ( 739806 ) on Sunday April 09, 2006 @02:16PM (#15095445)
    I missed that eBay auction deadline again! I'd better start using FedEx for the new versions.
  • by tyroneking ( 258793 ) on Sunday April 09, 2006 @02:22PM (#15095466)
    From the website "Webaroo is a stealth-mode technology startup" which obviously means something very clever ... personally I use WinHTTrack on a small number of sites, now if someone offered pre-downloaded WinHTTrack sites ...maybe to order ...
    Anyway, more importantly - Dr Who is due back on UK TV soon I think (slightly disappointing end to last series - shame to see Chris E leave) so here's a joke that Webaroo might like to 'cache' ... "What do Daleks have for a snack? ...
    Dalek bread..." geddit? (thanks to a kids radio show for that one).
  • could give me Duke Nukem Forever or the next Amiga OS release.
  • by Glowing Fish ( 155236 ) on Sunday April 09, 2006 @02:30PM (#15095499) Homepage
    This actually isn't by any means a new idea.

    If you've ever written or read html, you know that html doesn't care if links start file:// or if they start http://. HTML has always been quite neutral on whether it was linking to a local file system or getting something over the internet. Of course, most people don't use html extensively for local content. So in theory, this isn't a new idea at all.

    In practice, I don't see a lot of points for it. I can imagine that some people might want a map of a new city, with clickable pictures and information about various services there. Most features of a city map are going to stay the same for at least six months, so this is the type of thing that could be done statically. But even with this, internet access is so widespread, that it seems like a solution for a minor problem. Also, if you want a handy city guide, it would make more sense to me to write it from scratch rather than use a kludge of cached web pages.
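
To illustrate the point about links being neutral between local files and the network, here is a small sketch in Python. It assumes nothing about Webaroo's software: it writes a throwaway page (the file name and link are invented) and reads it back through a file:// URL using the same standard-library call that would also accept an http:// URL.

```python
import tempfile
from pathlib import Path
from urllib.parse import urlsplit
from urllib.request import urlopen


def fetch(url: str) -> bytes:
    """Read a resource; urlopen handles file:// and http:// URLs alike."""
    with urlopen(url) as response:
        return response.read()


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical local page standing in for a cached city guide.
    page = Path(tmp) / "guide.html"
    page.write_text('<a href="map.html">City map</a>', encoding="utf-8")

    local_url = page.resolve().as_uri()   # a file:// URL for the local page
    scheme = urlsplit(local_url).scheme
    body = fetch(local_url)

print(scheme, body.decode("utf-8"))
```

The relative link inside the page resolves against whatever URL the page was loaded from, so the same markup works from a disk or a server.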
  • by hotspotbloc ( 767418 ) on Sunday April 09, 2006 @02:31PM (#15095503) Homepage Journal
    It's an offline, indexed database; interesting but hardly newsworthy. So unless they've broken the Shannon limit there's nothing more here than IPO fodder.
  • by sunwolf ( 853208 ) on Sunday April 09, 2006 @02:35PM (#15095521)
    How are they to justify selling other people's websites? What about the sites' lost ad revenues?
  • by Guppy06 ( 410832 ) on Sunday April 09, 2006 @02:37PM (#15095540)
    Now I can say that I've finished downloading all the intrawebs!
  • by Finni ( 23475 ) on Sunday April 09, 2006 @02:41PM (#15095550)
    I'm surprised no one has mentioned the word 'aleph' yet.
  • by ecloud ( 3022 ) on Sunday April 09, 2006 @02:44PM (#15095556) Homepage Journal
    Lemme guess, they're going to do that with SQL on Rails [slashdot.org]. (If you didn't see the screencast, that's part of their April 1 demo - they did a SQL query on "the internet", and claimed to have downloaded the whole internet into tables beforehand.)
  • by ruiner13 ( 527499 ) on Sunday April 09, 2006 @02:54PM (#15095581) Homepage
    A boss I once had while working on a NSF grant funded project a handful of years ago held a meeting his first week on the job. This is his actual quote: "I'm not very good at searching the internet, can one of you put it on a CD for me?" Followed by everyone else in the meeting promptly walking out of the room shaking our heads.

    This project was a high school biology series of CD-ROMs, which used html/javascript on a CD (worked in all browsers, all platforms). It was a great project, except that moron gave away "samples" to so many schools the market dried up, as well as feature creep which prevented him from ever declaring the CDs gold. I suspect this project is led by this moron (or a cloned similar PHB model), and will never come to fruition.

    Moral of the story is, don't let a project director hire one of his "soccer buddies" to lead a project just because his friend is unemployed. We all became that way (except for the stupid PHB who still works for the university but hasn't had a raise in 5 years... it is nearly impossible to be fired from a public university).

  • feh (Score:4, Interesting)

    by andreyw ( 798182 ) on Sunday April 09, 2006 @03:27PM (#15095687) Homepage
    Frankly, I could see a market for this *maybe* 10-12 years ago. It just doesn't make any sense now. The internet is not solely about static content. Also, the thimble of data provided in each pack will be underwhelming and perpetually out of date.

    I mean, if I know I won't be online for a week, what stops me from just CURLing or WGETing whatever I plan on reading for the next couple of weeks? And that goes only for static content like books and articles. Everything else cannot simply be cached.
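
The "just grab it yourself" approach can be sketched in a few lines of Python. This is an illustration, not a replacement for wget or curl: the helper names are hypothetical, it saves only the page itself (no images or linked pages), and, as the comment says, it only makes sense for static content.

```python
from pathlib import Path
from urllib.parse import urlsplit
from urllib.request import urlopen


def http_fetch(url: str) -> bytes:
    """Download one URL and return its raw bytes."""
    with urlopen(url, timeout=30) as response:
        return response.read()


def local_name(url: str) -> str:
    """Build a flat file name from the host and path of a URL."""
    parts = urlsplit(url)
    path = parts.path.strip("/").replace("/", "_") or "index"
    return f"{parts.netloc}_{path}"


def save_for_offline(urls, dest, fetch=http_fetch):
    """Save each URL into dest and return the list of files written."""
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    saved = []
    for url in urls:
        target = dest / local_name(url)
        target.write_bytes(fetch(url))  # static snapshot of the page only
        saved.append(target)
    return saved
```

Calling `save_for_offline(["https://example.org/article.html"], "reading")` would write one file into a `reading` directory. The `fetch` parameter exists only so the download step can be swapped out, for example when testing without a network.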
  • by kbahey ( 102895 ) on Sunday April 09, 2006 @04:06PM (#15095814) Homepage
    Look at it from an alternate perspective ...

    For most of North America, where high speed is fairly common and unmetered, this is not a good idea.

    For some other parts of the world, the internet is only available in dialup, and is metered. Spending hours surfing can be very cost prohibitive.

    So, if large parts of the net are available offline, I can see a market for those geographical areas, provided the cost is not prohibitive ...

  • by jacoplane ( 78110 ) on Sunday April 09, 2006 @04:47PM (#15095977) Homepage Journal
    Well, Wikipedia is licensed under the GFDL so there has never been any problem downloading [wikimedia.org] the database for it. There are even many different versions for mobile platforms and XP [infodisiac.com] (including search functionality). And the ipod [sourceforge.net] of course.
  • by Nom du Keyboard ( 633989 ) on Sunday April 09, 2006 @05:20PM (#15096135)
    80GB, huh. What's that? Two, dual-layer BluRay discs. Might make a great case for the next DVD technology.
  • by chill ( 34294 ) on Sunday April 09, 2006 @05:43PM (#15096251) Journal
    Last time I passed thru customs in London, they asked about the laptop and "do I have the Internet on there". I told him "no" but now, thanks to these dweebs, I'll have to say "Yes, I have the Internet on my laptop."

    Bastards.

      -Charles