Submission + - Artemiy Margaritov Wins €9000 In the First 10x Hutter Prize Award

Baldrson writes: The Hutter Prize for Lossless Compression of Human Knowledge has now awarded €9000 to Artemiy Margaritov, the first winner under the 10x expansion of the prize, announced over a year ago in conjunction with a Lex Fridman podcast!

Artemiy Margaritov's STARLIT algorithm achieved a 1.13% improvement, clearing the 1% hurdle required to beat the previous benchmark, set by Alexander Rhatushnyak. He also receives a bonus proportional to the time elapsed since that benchmark was set, raising his award by 60% to €9000.
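
As a rough sanity check on those figures (a sketch only, using the numbers quoted above; the authoritative award formula is on hutter1.net):

    # Rough check of the arithmetic above, assuming the award is the
    # prize fund times the relative improvement, plus the time bonus.
    prize_fund = 500_000      # euros: the 10x purse
    improvement = 0.0113      # STARLIT's 1.13% relative improvement
    time_bonus = 0.60         # 60% bonus for time since the last record

    base_award = prize_fund * improvement   # 5650.0 euros
    total = base_award * (1 + time_bonus)   # 9040.0 euros, i.e. ~9000
    print(base_award, total)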

Congratulations to Artemiy Margaritov for his winning submission!

Submission + - Rule Update: 5'000 Byte Per Day Relaxation of Award Threshold

Baldrson writes: Marcus Hutter has lowered the bar for The Hutter Prize for Lossless Compression of Human Knowledge. It has been a year since the prize was increased by a factor of 10, and there have been no new entries. By decreasing the compression threshold for an award by 5,000 bytes per day, Hutter hopes to increase the rate and fairness of prize awards, and hence progress toward artificial general intelligence. From the Hutter Prize FAQ: Why do you grant a 'discount' of 5'000 Byte per day?

The contest went big in early 2020, but so far no one has been able to beat the baseline. The discount was introduced to ease participation and to guarantee at least one eventual winner. It amounts to around 1.5% per year, so it should allow a first winner within a year, or at most two. The discount is incremental in order to prevent a quick win by whoever notices it first.
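
A minimal sketch of how the relaxation accrues. The baseline size and start date below are illustrative placeholders, not the official figures:

    # Sketch of the 5,000-byte-per-day relaxation described in the FAQ.
    # BASELINE and START are illustrative; the official record size and
    # discount start date are published on hutter1.net.
    from datetime import date

    BASELINE = 116_000_000     # bytes: ballpark of the ~116 MB enwik9 record
    DISCOUNT_PER_DAY = 5_000   # bytes
    START = date(2020, 2, 21)  # placeholder for the official start date

    def threshold(today: date) -> int:
        """Size an entry must beat once the daily discount is applied."""
        return BASELINE - DISCOUNT_PER_DAY * (today - START).days

    # 5,000 bytes/day is ~1.8 MB/year, roughly 1.5% of the baseline per
    # year, matching the FAQ's figure.
    print(threshold(date(2021, 2, 21)))   # one year in: ~114.2 MB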

Comment Re:just use BERT? oh wait... (Score 1) 65

retchdog writes:

If BERT can magically sneak in some spooky black magic, then you're admitting that enwik9 is not an adequate data set for testing, end of story.

No. All BERT must do is beat the Large Text Compression Benchmark -- a benchmark that uses enwik9 as the corpus.

Comment Re:Annoying (Score 1) 65

Megol asks:

But let's assume the information is important, why should the raw text be repeated instead of the myriad variants that express the same thing in a different way?

For the same reason scientists don't throw out the raw measurements in their experiments just because they depart from theory. One can assert that the form of the expression is irrelevant to the underlying knowledge, but that assertion is itself a sub-theory, based on the agent's comprehension of the entire corpus. This becomes particularly relevant when attempting to assign latent identities to sources of information so as to model their bias.

Comment Re:just use BERT? oh wait... (Score 1) 65

Separating the two questions: 1) "Why not just use BERT?" and 2) "Why exclude parallel hardware?"

1) BERT's algorithmic opacity leaves open an enormous "argument surface" regarding "algorithmic bias". A search for "algorithmic bias" returns more than 100,000 hits, owing to the inadequately principled model selection criteria used by the ML community, and thence by Google's algorithms. Algorithmic Information Theory cuts that argument surface down to its mathematical definition of natural science: Solomonoff Induction. It is provable that, if classical (Cartesian) causality holds, distilling a dataset of observations to its Kolmogorov Complexity factors out all the bias you possibly can while still claiming to be "data-driven" in your policies. The BERT folks have yet to present a compressed enwik8 (100MB) or enwik9 (1GB) to demonstrate that theirs is a superior, unbiased model. They should do so. They don't even have to submit an entry to the Hutter Prize, so they are free to use Google's entire TPU farm, or whatever. Let this be a challenge to them if they are serious about answering accusations of bias in their algorithms. (A sketch of the compression-as-upper-bound idea follows the second point below.)

2) If one reads the relevant Hutter Prize FAQ answer, it becomes apparent that even a 2080Ti (or other GPU/TPU) may not be affordable to some of the best minds on the planet.
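
On point 1, a minimal illustration: Kolmogorov Complexity itself is uncomputable, but any lossless compressor's output size is an upper bound on it, and tightening that bound is exactly what the benchmarks measure. A sketch using standard-library compressors (it assumes a local copy of enwik8 in the working directory):

    # Sketch: each lossless compressor's output size upper-bounds the
    # Kolmogorov Complexity of the data; a better model gives a tighter
    # bound. Assumes enwik8 has been downloaded beforehand.
    import bz2, gzip, lzma

    data = open("enwik8", "rb").read()

    for name, compress in [("gzip", gzip.compress),
                           ("bz2",  bz2.compress),
                           ("lzma", lzma.compress)]:
        print(name, len(compress(data)), "bytes")

Note that a Hutter Prize entry is scored on the compressed data plus the decompressor itself, so the bound cannot be gamed by hiding the model inside the program.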

Submission + - The Hutter Prize Increased To 500,000€ and 1GB (hutter1.net)

Baldrson writes: AI professor Marcus Hutter has gone big with his challenge to the artificial intelligence community, first announced on Slashdot in 2006. A 500,000€ purse now backs The Hutter Prize for Lossless Compression of Human Knowledge. Contestants compete to compress Wikipedia to its essence. The 1-billion-character excerpt of Wikipedia called "enwik9" is approximately the amount that a human can read in a lifetime.
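
A back-of-the-envelope check of that "lifetime of reading" claim (the reading-speed figures below are rough assumptions, not part of the contest):

    # Rough check: how long would reading 1e9 characters take?
    # Assumes ~5 characters per English word and ~250 words per minute.
    chars = 1_000_000_000
    hours = chars / 5 / 250 / 60    # ~13,333 hours of reading
    per_day = hours / (50 * 365)    # spread over 50 years
    print(round(hours), "hours, or",
          round(per_day * 60), "minutes/day for 50 years")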

Hutter's challenge is an advance over the Turing Test, devised by the pioneering AI theorist Alan Turing, in which a chat bot must fool a human: it is pass-fail. Hutter's prize, by contrast, incrementally rewards distillation of Wikipedia's storehouse of human knowledge to its essence. The judging criterion derives from a mathematical theory of natural science, informally known as "Occam's Razor" and formally called Algorithmic Information Theory, or AIT. AIT is, according to Hutter's "AIXI" theory, essential to Universal Intelligence.

Hutter's judging criterion is superior to Turing's in 3 ways:

1) It is objective,
2) It rewards incremental improvements,
3) It is founded on a mathematical theory of natural science.

Detailed rules for the contest and answers to frequently asked questions are available.

Submission + - TIBET(tm) 5.0 Release, Intro Video, Docs and GitHub Repo (medium.com)

Baldrson writes: Scott Shattuck at Medium.com reports: "TIBET 5.0 is now available." A dream? To some. A NIGHTMARE TO OTHERS, particularly those counting on TIBET 5.0 to be the Duke Nukem Forever of web application frameworks. See the introductory "Why TIBET?" tl;dr video, the documentation and white papers for the literate, or the TIBET repo for the obsessive.

Comment Shortage of STEM Workers? (Score 1) 358

Many if not most of the responses here posit that the degree requirement, even if not directly related to the job, is a cheap, if crude, way to filter out a deluge of job applicants.

If there is such a shortage of STEM workers that it is necessary to import so many that Silicon Valley has become majority Asian in less than a generation, it is rather difficult to justify such crude measures.

In reality, what is going on is that capturing positive network externalities, not invention, has increasingly become the VC business model. This creates monopoly profits that insulate management from bad hiring decisions. Rather than letting those bad decisions go to waste, Asian cultures, which can smell economic rent 10,000 miles away, ramped up their diploma mills (a diploma being the equivalent of a taxi cab medallion in terms of rent seeking), targeted the network-effect monopolies and the hiring authority within those companies, imported their "degreed" coethnics in huge numbers under the H-1B program, and focused more and more of the VC world on the rent-seeking network-effect business model. The "guest workers" are then on a green card track; once obtained, the green card raises their value in the dowry market by tens of thousands of dollars. Everyone wins, except Western civilization and the folks who built it.

Comment Universal algorithmic IQ test (Score 1) 384

“Sandra Wachter, a researcher in data ethics and algorithms at the University of Oxford, said: ‘The world is biased, the historical data is biased, hence it is not surprising that we receive biased results.’”

The single most subversive thing that can be done in the present environment is to financially back lossless compression prizes. One such prize is the Hutter Prize for Lossless Compression of Human Knowledge, although it needs to be expanded to include all of Wikipedia. Perhaps a more immediate prize would be based on compressing a wide variety of social science data. Sandy can then show everyone how smart she is by modeling the "bias in the data" so as to better predict it, which is exactly why compression is _the_ unbiased universal algorithmic IQ test.

See: https://vimeo.com/17553536

Comment Who Will Protect the Internet Archive Itself? (Score 5, Interesting) 590

If you have a domain name under which you have a lot of content -- an example is kuro5hin.org -- and after a decade or so you find yourself impoverished and stressed to the point that you can't renew the domain registration (as happened to Rusty Foster), a domain squatter jumps on it and holds it hostage for thousands of dollars. When that happens, frequently even "The Wayback Machine" is told to deep-six the archived content by the simple expedient of placing a robots.txt file in the home directory of the hijacked domain. "The Wayback Machine" then dutifully removes public access to the content. Oh, but the fun doesn't stop there! Say you fork over the ransom money to the domain squatter, get the domain name back, and remove the robots.txt. Of course "The Wayback Machine" then restores public access to all those articles... right?

WRONG!

archive.org does keep the content stored, and it is accessible to those with insider status, but there is no more public access, EVER.

There really is value in hoarding history, and if you can get away with it by doing it "on accident", all the better!
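
For reference, the exclusion mechanism described above is easy to probe: the Wayback Machine's crawler has historically identified itself as ia_archiver, and Python's standard library can check whether a given robots.txt excludes it (the domain below is hypothetical):

    # Sketch: test whether a site's robots.txt blocks the Internet
    # Archive's crawler ("ia_archiver"). The domain is hypothetical.
    from urllib.robotparser import RobotFileParser

    rp = RobotFileParser("https://example.org/robots.txt")
    rp.read()     # fetch and parse the live robots.txt
    print(rp.can_fetch("ia_archiver", "https://example.org/"))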
