Comment: Re:Am I missing something? (Score 5, Informative) 233

by LourensV (#43515753) Attached to: Physicist Proposes New Way To Think About Intelligence

Suggesting that the purpose of intelligence in this man's random musings might be to increase the background levels of entropy for your own benefit.

That's close, I think. I am not a physicist and I skimmed the equations, but here's my take on what they're proposing. Physical systems have states, which can be described by a state vector. The state of these systems evolves according to some set of rules that describes how the state vector changes over time. They've built a simulator in which the probability of a certain state transition is computed by looking at how many different paths (in state space, i.e. future histories of the system) are possible from the new state, in such a way that the system tries to maximise the number of possibilities for the future. In one example, they have a particle that moves towards the centre of a box, because from there it can move in more directions than when it's close to a wall.
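To make the box example a bit more concrete, here is a toy sketch of the "move to wherever the most futures are open" idea. It is not the authors' simulator: it is a deterministic, one-dimensional stand-in, and the names `WIDTH`, `HORIZON`, `count_paths`, `step` and `simulate` are all made up for illustration. A future path is assumed to be a sequence of moves of -1, 0 or +1 that never leaves the box.

```python
from functools import lru_cache

WIDTH = 11    # hypothetical box: positions 0..10, walls just outside
HORIZON = 10  # hypothetical look-ahead, in moves


@lru_cache(maxsize=None)
def count_paths(pos, steps):
    """Count walks of `steps` moves (-1, 0 or +1) from `pos` that stay in the box."""
    if pos < 0 or pos >= WIDTH:
        return 0  # this walk has left the box, so it does not count
    if steps == 0:
        return 1
    return sum(count_paths(pos + d, steps - 1) for d in (-1, 0, 1))


def step(pos):
    """Move to the reachable position with the most possible future paths."""
    candidates = [p for p in (pos - 1, pos, pos + 1) if 0 <= p < WIDTH]
    return max(candidates, key=lambda p: count_paths(p, HORIZON))


def simulate(start, n_steps):
    """Return the list of positions visited, starting at `start`."""
    path = [start]
    for _ in range(n_steps):
        path.append(step(path[-1]))
    return path


print(simulate(1, 8))
```

Started next to a wall, the particle walks to the middle position (5) and stays there, simply because fewer of its possible futures get cut off by a wall from there. The real thing is stochastic and works on continuous physical state, so take this only as an illustration of the principle.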

They then set up two simple models mimicking two basic intelligence tests, and find that their simulator solves them correctly. One is a cart with a pendulum suspended from it, which the system moves into an upright position because from there it's easiest (cheapest energetically, I gather) to reach any other given state. The other is an animal intelligence test, in which an animal is given some food in a space too small for it to reach, and a tool with which the food can be extracted. In their simulation, the "food" is indeed successfully moved out of the enclosed space, because it's easier to do various things with an object when it's close compared to when it's in a box. However, in neither case does the algorithm "know" the goal of the exercise. So they've shown that they've invented a search algorithm that can solve two particular problems, problems which are often considered tests of intelligence, without knowing the goal.

Then, they use this to support the hypothesis that intelligence essentially means maximising future possibilities. Another way of saying this, I think, is that an intelligent creature will seek to maximise the amount of power it has over its environment, and they've translated that concept into the language of physics. That's an intriguing concept, relating to the concept of liberty, power struggles between people at all scale levels, scientific and technological progress, and so on. I can't imagine this idea being new though. So it all hinges on to what extent this simulation adds anything new to that discussion.

On the face of it, not much. You might as well say that they've found two tests for which the solution happens to coincide with the state that maximises the number of possible future histories. The only surprising thing then is that their stochastically-greedy search algorithm (actually, without having looked at the details, I wouldn't be surprised if it turned out to be yet another variation of Metropolis-Hastings with a particular objective function) finds the global solution without getting stuck in a local minimum, which could be entirely down to coincidence. It's easy to think of another problem that their algorithm won't solve, for example if the goal would be to put the "food" into the box, rather than taking it out. Their algorithm will never do that, because that would increase the future effort necessary to do something with it. Of course, you might consider that pretty intelligent, and many young humans would certainly agree, although their parents might not. It would be interesting to see how many boxed objects you need before the algorithm considers it more efficient to leave them neatly packaged rather than randomly strewn about the floor, if that happens at all.

There's another issue in that the examples are laughably simple. While standing upright allows you to do more different things, no one spends their lives standing up, because it costs more energy to do that as a consequence of all sorts of random disturbances in the environment. The model ignores this completely. Similarly, you could argue that since in the simulation (unlike in the actual animal experiment) there is no reward for using the object, expending the energy to get it out of its box is not very intelligent at all.

Conclusion: an interesting idea, but in its present state not much more than that.

Comment: Re:Backwards (Score 3, Insightful) 147

by LourensV (#43378843) Attached to: Ask Slashdot: Linux Friendly Video Streaming?

The usual answer to questions like this is:

(1) Decide what you want the computer to do

(2) Acquire the right platform.

Saying "I've already got [whatever platform], how do I make it do what I want?" is often not a helpful approach.

If RMS and Linus had followed that advice, GNU, Linux, and probably Slashdot would never have existed. Why should one have to buy Windows and allow customer-hostile DRM software on one's computer to be able to watch a movie easily and legally? It's your computer, and the whole point of owning it is that you can make it do what you want. Trying to do just that seems perfectly reasonable to me, and I can't see how any system that doesn't allow you to do that could be the "right platform" for anything.

Comment: Re:How modern! (Score 1) 75

by LourensV (#43369061) Attached to: R 3.0.0 Released

I can somewhat relate to the documentation issue although I believe that it is more a question of organizing the documentation.

One of the things that bothers me about the documentation is that there's often no distinction between interface and implementation. Instead of a description of what a function does, you get implementation details mixed up with what it approximately hopes to achieve, leaving you unable to see the forest for the trees.

When you mention "a fundamental problem" you mention function implementations, thus library rather than language issues. R itself is an extremely expressive, functional (or rather multi-paradigm) language that can be programmed to run efficient code. Yet it is syntactically minimalistic without unneeded syntax (as opposed to all of the scripting languages perl/python/ruby). This makes it a truly postmodern language IMO.

Well, there's only one implementation, so it's rather pointless that it could be implemented efficiently. The language specification isn't exactly good enough to create a competing, compatible implementation either. I agree that the syntax is minimalistic and that there's extremely little boilerplate, but I could really do with some way of defining data types (Python 2 is lacking there as well IMO), and namespaces...

Efficiency can sometimes be a problem but the break-even point for implementing parts in say C/C++ is only slightly different than for other languages (say perl/python) and is enabled by an excellent interface (Rcpp package).

Ah, the universal solution to problems with R: here's how to do it in some other language or software instead. Sorry for being sarcastic, but it's amazing how often effectively that advice showed up whenever I searched the web for a solution to some problem I encountered with R.

As an example of my experience, I use JAGS to fit models to data, and JAGS wants to have the model as a text file description. My model has a node for every combination of some 13000 sites and 11 years, and the text file gets to several tens of megabytes depending on model options. Creating it is basically a matter of running through all the combinations of sites and years, looking up some additional data, and spitting out a line of text describing them. My first implementation was very naive: nested for loops that essentially did a nested-loop join on the data. It generated output at several tens of kilobytes per second, getting slower and slower as it went on. I managed to speed it up by preallocating memory, pre-sorting data and changing to a merge join, and vectorising as much as possible. (R seems not to double the capacity of a vector when it runs out, as the C++ STL does, but to add a constant extra amount, so that growing a vector made the loop run in quadratic time, except that when measured it actually seemed to be exponential, for who knows what reason.) It now does about a megabyte per second, which is fast enough for my purposes. However, the code is now completely unreadable, and it's still not anywhere near what the hardware can do (PostgreSQL does the equivalent nested loop in less than a second). R turned what should have been a trivial programming task into a frustrating adventure, and the result is still not very good.
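To illustrate the remark about vector growth, here is a small sketch that just counts how many element copies you pay for when appending to a buffer that is reallocated whenever it fills up. It says nothing about what R actually does internally; the fixed increment of 1000 and all function names are assumptions picked for the demonstration.

```python
def copies_when_growing(n_items, grow):
    """Count element copies made while appending n_items one at a time.

    `grow` maps the current capacity to the new capacity when the buffer
    is full; each reallocation copies every element stored so far.
    """
    capacity, size, copies = 1, 0, 0
    for _ in range(n_items):
        if size == capacity:
            capacity = grow(capacity)
            copies += size
        size += 1
    return copies


def constant_increment(capacity):
    """Hypothetical policy: add a fixed number of slots each time."""
    return capacity + 1000


def doubling(capacity):
    """Geometric policy: double the capacity each time."""
    return capacity * 2


for n in (10_000, 100_000):
    print(n, copies_when_growing(n, constant_increment),
          copies_when_growing(n, doubling))
```

With the fixed increment the copy count grows roughly with the square of the number of items, while with doubling it stays below twice the number of items. Preallocating the full length up front avoids the copies altogether, which is why it helped.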

For myself the biggest change to make was to start thinking in functional concepts coming from a procedural background. Much of R criticism IMO stems on a failure to realize conceptual differences between functional and procedural programming. Another problem that might spoil the impression of R sometimes is the plethora of packages of highly varying quality.

True, but this is really another instance of the don't-do-it-in-R solution, because those functional programming functions effectively just run your loop in C, rather than in R (if they don't forward the whole operation to a C scientific maths library), which makes the performance bearable. If R were really a multi-paradigm language, then you would be able to solve a problem procedurally as well if it happened to be the best way to do it.

Comment: Re:How modern! (Score 4, Insightful) 75

by LourensV (#43367053) Attached to: R 3.0.0 Released

I recently switched my scientific programming from R to Python with NumPy and Matplotlib, as I couldn't bear programming in such a misdesigned and underdocumented language any more. R is fine as a statistical analysis system, i.e. as a command line interface to the many ready-made packages available in CRAN, but for programming it's a perfect example of how not to design and implement a programming language. It's also unusably slow unless you vectorise your code or have a tiny amount of data. Unfortunately, vectorisation is not always possible (e.g. the algorithm may be inherently serial), and even when it is, it tends to yield utterly unreadable code. Then there is the dysfunctional memory management system which leads you to run out of memory long before you should, and documentation even of the core library that leaves you no choice but to program by coincidence.

As an example of a fundamental problem, here's an R add-on package that has as its goal to be "[..] a set of simple wrappers that make R's string functions more consistent, simpler and easier to use. It does this by ensuring that: function and argument names (and positions) are consistent, all functions deal with NA's and zero length character appropriately, and the output data structures from each function matches the input data structures of other functions.". Needless to say, there is absolutely no excuse for having such problems in the first place; if you can't write consistent interfaces, you have no business designing the core API of any programming language, period.

Python has its issues as well, but it's overall much nicer to work with. It has sane containers including dictionaries (R's lists are interface-wise equivalent to Python's dictionaries, but the complexity of the various operations is...mysterious) and, with NumPy, all the array computation features I need. Furthermore it has at least a rudimentary OOP system (speaking of Python 2 here, I understand they've overhauled it in 3, but I haven't looked into that) and much better performance than R. On the other hand, for statistics you'd probably be much better off with R than with Python. I haven't looked at available libraries much, but I don't think the Python world is anywhere near R in that respect.

Anyway, for doing statistics I don't really think there's anything more extensive out there than R, proprietary or not, although some proprietary packages have easier to learn GUIs. In that field, R is not going to go anywhere in the foreseeable future. For programming, almost anything is better than R, and I agree that those improvements you mention are not doing much to improve R's competitiveness in that area.

Comment: Re:I get the impression that (Score 1) 180

by LourensV (#42807641) Attached to: Python Gets a Big Data Boost From DARPA

You're not picking on me, you're arguing your point. That's what this thing here is for, so no hard feelings at all.

I'll readily admit to not knowing Fortran (or much Python! ;-)); I'm a C++ guy myself, having got there through GW-Basic, Turbo Pascal and C. I now teach an introductory programming course using Matlab (and know of its history as an easy-to-use Fortran-alike), and I use R because it's what's commonly used in my field of computational ecology. I greatly dislike R, and I'm not too hot on Matlab either, as the first thing you should do when programming is to decide what the program is about, and to express that you need type definitions, which neither Matlab nor R has. From a very quick look around, at least recent versions of Fortran do have them, so that's good in my view. As for the RAM limitations in R, it seems to me that that is actually a consequence of the vectorised style of programming and the lack of lazy evaluation: you tend to get either unreadable code with enormous expressions, or a lot of temporaries which eat up lots of RAM.
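As a rough illustration of the temporaries point, here is the same three-step computation written twice in Python: once in an eager, whole-vector style where every step builds a full intermediate list, and once with generator expressions standing in for lazy evaluation. This is only an analogy, not R code, and the function names and the arithmetic are made up for the example.

```python
def eager_sum_of_squares(values):
    """Whole-vector style: each step materialises a full temporary list."""
    scaled = [v * 0.5 for v in values]    # temporary 1, same length as input
    shifted = [v + 1.0 for v in scaled]   # temporary 2
    squared = [v * v for v in shifted]    # temporary 3
    return sum(squared)


def lazy_sum_of_squares(values):
    """Lazy style: generators hand over one element at a time, no temporaries."""
    scaled = (v * 0.5 for v in values)
    shifted = (v + 1.0 for v in scaled)
    squared = (v * v for v in shifted)
    return sum(squared)


data = range(100_000)
print(eager_sum_of_squares(data) == lazy_sum_of_squares(data))
```

Both give the same result, but the eager version holds three full-length lists in memory before it gets to the sum, while the lazy one only ever holds a single element per stage. The alternative in the eager style is to cram everything into one big expression, which is exactly the readability problem.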

Replying to your other post, I was thinking of the many hundreds of millions that are spent on satellites and the dedicated compute clusters for weather forecasting. I've also heard of budget issues and lack of replacement satellites in that area, but it's still a lot of money compared to most grants. Over here it's big news if someone manages to get a million Euro grant, spread over a couple of years, while NOAA has a 4.7 billion USD yearly budget. Of course they do other things than weather forecasting, I'm comparing an entire government organisation to a single scientific investigation here, but it's a different level for sure.

In the end, I suspect that we're simply in different fields, and therefore seeing different things. Generally speaking, the more physical the field, the more tech-savvy the scientists, and the more computer use. In my institute, Microsoft Excel is by far the number one data processing tool...

Comment: Re:I get the impression that (Score 5, Insightful) 180

by LourensV (#42806555) Attached to: Python Gets a Big Data Boost From DARPA

You're probably right, but you're also missing the point. Most scientists are not programmers who specialise in numerical methods and software optimisation. Just getting something that does what they want is hard enough for them, which is why they use high-level languages like Matlab and R. If things are too slow, they learn to rewrite their computations in matrix form, so that they get deferred to the built-in linear algebra function libraries (which are written in C or Fortran), which usually gets them to within an order of magnitude of these low-level languages.

If that still isn't good enough, they can either 1) choose a smaller data set and limit the scope of their investigations until things fit, 2) buy or rent a (virtual) machine with more CPU and more memory, or 3) hire a programmer to re-implement everything in a low-level language so that it can run in parallel on a cluster. The third option is rarely chosen, because it's expensive, good programmers are difficult to find, and in the course of research the software will have to be updated often as the research question and hypotheses evolve (scientific programming is like rapid prototyping, not like software engineering), which makes option 3) even more expensive and time-consuming.

So yes, operational weather forecasts and big well-funded projects that can afford to use it will continue to use Fortran and benefit from faster software. But for run-of-the-mill science, in which the data sets are currently growing rapidly, having a freely available "proper" programming language that is capable of relatively efficiently processing gigabytes of data while being easy enough to learn for an ordinary computer user is a godsend. R and Matlab and clones aren't it, but Python is pretty close, and this new library would be a welcome addition for many people.

Comment: Re:Editorial work? (Score 4, Informative) 162

by LourensV (#42626063) Attached to: Mathematicians Aim To Take Publishers Out of Publishing

Unfortunately the vast majority of posters have never had any work published and make the false assumption that it's all gravy for the publishers. Editing anything - scientific papers, manuscripts, textbooks - is a considerable effort, far more than spell check in Word. Layout is also important to make best use of space and present the work clearly to the reader. So the text (including tables and figures) that the author sends to the publisher does not equate to editorial review or layout work. All costs must also be spread over the expected readership of the journal, which in the case of most scientific journals is not a very large audience.

Last time I had something published in a peer-reviewed (Elsevier) journal, I sent them a LaTeX file using their stylesheets, all formatted and ready to go (and boy are tables a b*tch in LaTeX!). They don't give you the actual styles they use to format papers, but presumably the ones they do make available are compatible, so there was very little work on their end. Then, I went and did it all a second time myself (the published styles are not very readable, and I wasn't sure about copyright issues), so that I could publish a readable version as a preprint for free access through my institution's repository (which is allowed). Granted, most people in my field will just send in Word files and some images, and someone has to arrange them neatly. That's not that big a job though, and they're certainly not going to make your pictures prettier (unless you pay them a hefty fee for that service) or do much more than running a spelling checker. If it's badly written, the peer reviewers will politely suggest you (note: not the publisher) get a native speaker to fix it up for you. I know several colleagues (none are native speakers) who have some or all of their papers checked for proper English by professional editors before submitting them, at their own expense.

In the case proposed here, there is also the added need for peer review with checks and balances, not just peer review by the guy who has plenty of free time because he has nothing else going on. Who is going to run this process? Who is going to prod slow reviewers? What about the final decisions to accept or reject? The opportunity for bias in decision making is going to be far higher. While academics are involved in the process now, the publisher (in theory) acts as last guarantor of good behavior.

The editor, like they do now? As far as I know, editors at least in the West generally do the job for the reputation capital and as a kind of community service, not for the money. I could see people volunteer some of their time as a (co-)editor just for the credits. Anyway, even an open access journal could charge a small submission fee to cover this, or it could be subsidised by bodies like the NSF.

Comment: Re:AKA pumped storage (Score 1) 242

by LourensV (#42624733) Attached to: Belgium Plans Artificial Island To Store Wind Power

Using an artificial island is an interesting idea. If you're using off-shore wind farms then the power generation is local and you save on infrastructure and transmission costs; you avoid destroying valuable mountainside (although at the expense of destroying valuable sea bottom).

Nothing to destroy there, the North Sea is pretty much an industrial wasteland. Fish populations were decimated long ago, all that's left is oil drilling rigs, shipping lanes, pipelines and wind farms. So an artificial island more or less is not going to be a problem in that respect, and the Low Countries don't have a lot of mountainside, valuable or not. Actually, it's not inconceivable that birds might breed on the island, away from most human influence. The wind turbines may be a problem for them though, not sure.

Comment: Re:At least one has merit... (Score 2) 97

by LourensV (#42618347) Attached to: Europe's Got Talent For Geeks

Well, that depends on your definition of complex I guess. I'm currently attempting to model the distribution of plants in space and time. That includes processes like dispersal, local colonisation and extinction, plant physiology, human influence, and species interactions. I haven't got all of them in yet, but to my eyes it's not simple.

I'm not interested in creating any kind of intelligence, I just want to know which model describes my data best, how good it is, why it is better than other models, what that says about reality, and how I can improve it further. That'll require some intelligence, which will come from me and my colleagues. And yes, I'm using Bayesian inference (MCMC model fitting using Gibbs sampling) to get there, because it's a good tool for the job.

Does that mean that my model is going to be any good at describing reality? No idea. That depends on how much data I can obtain to put in it, how accurately and comprehensively the processes involved are modelled, and to what extent the process I'm modelling is inherently random and unpredictable. So we'll see how it goes. However, giving up because "Statistical modeling cannot ever model complex things" strikes me as simplistic and defeatist.

Honestly, looking at your posts in this thread, I get the feeling that I'm seeing a case of Clarke's First Law here. But maybe that's just me.

Comment: Re:At least one has merit... (Score 1) 97

by LourensV (#42618101) Attached to: Europe's Got Talent For Geeks

For sure there's a lot of money going around at the European level, and a lot of it is not spent on research but on all sorts of processes around it. Quite a bit is probably wasted, other things are just a consequence of trying to organise anything at that large a scale. Anyway, I just wanted to note that I agree with you on the probability of a true artificial intelligence being created any time soon, but I don't think that any of these projects require one or propose to create one, so your argument is beside the point. The things they do propose do seem to my only somewhat informed eyes to be within the realm of the possible given the current state of the art, and worth a try.

Comment: Re:At least one has merit... (Score 4, Insightful) 97

by LourensV (#42614921) Attached to: Europe's Got Talent For Geeks

The Graphene one. The others are just the usual BS from people clueless about how computers work and what they can and cannot do.

Spoken like a true programmer or sysadmin with no knowledge of statistics, modelling, machine learning or data analysis. I know, because I was one (and I still write code and maintain servers). But I've also moved into the above fields, and it's a completely different world. The discrete math and logic you use in programming are completely useless here, and the things you can do and the hurdles you come across are very different from the ones you see in programming. Of course, you still have to implement your models and analyses, and you get all the usual issues there (plus things like numerical instability), but even if the software is running fine you'll have things like parameter identifiability, difficulties in comparing models, lack of data in the places where you need it, conceptual problems with the models that can only be solved by making them more complex, which leads to lack of data problems and the need for massive amounts of compute power, and so on. These are the things they will be trying to tackle, and they have nothing to do with the limitations of Turing-style computers.

I do remain sceptical about having a chat with a Turing-level AI any time soon, but data analysis, modelling and inference methods are getting better and better (see Google Search, Watson) and I don't think that continued research into these things is a waste of money. Neither do Google, Facebook, Microsoft, the US government, and the EU apparently.

Finally, here's another EU project in this direction that is both scary and interesting.

Comment: Re:Um... (Score 3, Informative) 75

by LourensV (#42473577) Attached to: Blue, Not Red: Did Ancient Mars Look Like This?

In Kim Stanley Robinson's Mars trilogy, the northern ocean is filled with fresh water from the molten polar ice cap, while the rivers take up salt from the rocks they flow over, so there are salty rivers flowing into a fresh water ocean. I'm not sure how realistic that is, but it doesn't seem completely illogical.

As artist impressions go, I prefer this one by Daein Ballard over the one in the article.

Comment: Linux Foundation and graphics/wifi drivers? (Score 4, Funny) 113

Maybe the Linux Foundation (or someone else, they're the first that come to mind) could do a similar thing to raise money for improving the Linux graphics and wireless stacks? How much improvement could we get for a million USD? Or perhaps there are individual developers out there who would do what Poul-Henning Kamp did? I'd be happy to contribute to such an initiative. Kickstart it?

Comment: Re:Canonical does have a compatible/certified list (Score 2) 352

by LourensV (#42413017) Attached to: Ask Slashdot: Linux-Friendly Motherboard Manufacturers?

I'm typing this on a Dell Latitude E6410, which is on that list (albeit with nVidia graphics I think, but Intel support is better, right? That's why I ordered it, anyway). When I first got this machine, it was also on the list, but Ubuntu 10.04 LTS (the most recent LTS) wouldn't even boot on it, just gave a black screen. Apparently there were multiple issues with the Intel graphics drivers, with both the E6410 and the E6510. Now it did seem that Canonical was giving those bugs some attention, but it still took many months for them to be fixed for most users. Then there was one last patch and it started working for me as well...until the next (ordinary, stable) update which broke it again. I ended up running 11.04 with a backported kernel that I didn't dare upgrade.

Then the touchpad (ALPS, not Synaptics) wasn't recognised as a touchpad, which they "solved" with a patch that sent a magic command combo to the device to switch it into imPS/2 mode. Result: scrolling worked, but it's still not recognised as a touchpad, and you still cannot configure it as such.

So yes, Canonical has a list, but I'd interpret that as "we'll try a bit harder to fix these machines, but they're otherwise just as broken as everything else", not as "tested and working".

Generally speaking, I'm seeing a lot of comments saying that there are no issues with server hardware. The OP didn't mention it, but it seems he/she was asking for desktop hardware, whose most critical functionality is (non-trivial) graphics hardware and, if it's a laptop, wifi. In my experience those things remain difficult for modern Linux kernels, especially on new hardware (i.e. anything you bought in the last year, maybe two), and there doesn't seem to be much progress either. As others have said though, Intel hardware seems to be your best bet.
