Comment: Re:It's just hard work and machine learning (Score 1) 68

by qpqp (#49337193) Attached to: Ask Slashdot: What Happened To Semantic Publishing?

There will always be some outliers/exceptions, but it should be possible to define the rules and vocabulary of a given system specifically enough, possibly by breaking it down further into facets/perspectives and then mapping the relations and constraints.
So then you could have many ontologies, which will gradually converge over time. I'm talking long-term, of course. The annotation part could also require consensus, or vetting, by multiple recognized entities. All in all, the result would still be more or less a fluid body, but then so is everything around us, as the only constant in our world is that everything is changing.

And I agree with you that ML and annotation/classification & co. are complementary tools. And it will take a lot of work to have end users semantically enrich their output.

Where I disagree is in your definition of a model, which is not necessarily an incorrect representation. It's just a representation; the level of detail varies from use case to use case.

So anyway, the big question is how to get there...

Comment: Re:It's just hard work and machine learning (Score 1) 68

by qpqp (#49333613) Attached to: Ask Slashdot: What Happened To Semantic Publishing?

And if I've misrepresented rockmuelle, or misunderstood your question, qpqp, it's because I don't have an exact model of what you're saying.

Come now, don't blame everything on me!

What I meant by exact model is of course a predictable, and in a sense deterministic process; inasmuch as that is possible for the given case.
Even with machine learning you create a representation of the surveyed system, but this model will (currently, and in most cases) always be an approximation.
By mapping concepts, their (often ambiguous) meanings, usage scenarios and other relations from different areas to each other, supported by these approximations, it should in time be possible to avoid the issues related to the fuzziness and create a truly smart and adaptive system.

Of course, our universe (as far as we know) is (inherently?) non-deterministic. And obviously, if that is so, you'd have to somehow cheat (e.g. be able to observe our universe from more than the 4 dimensions we can perceive) to get a truly exact model, assuming that some (reachable) abstraction point is deterministic.
What I'm suggesting is that with some effort it should be possible for us to come up with something with the ability to understand something (like you did with my question, despite lacking an exact model;) ). And while ML is quite crude and more like a sledgehammer, an accurate definition is more like a chisel. At least with respect to the model(s).
Assuming such a system is created, it will have limitations similar to those of humans with regard to the ability to understand something, as we do not know everything, as far as I am aware.

But anyway, the librarians didn't have the technical capability to create the kind of multi-dimensional mess that we currently can, so maybe these things we're talking about just have their own math that we need to understand the proper rules for. It's all metadata anyway, but currently, I guess the closest we have to an exact model is in the hands of the NSA...

Comment: Re:I hope "semantic" != "annoying popups" (Score 1) 68

by qpqp (#49332497) Attached to: Ask Slashdot: What Happened To Semantic Publishing?

I'm not sure I really follow your argument

Well the other services (except for email, obviously) are largely run by volunteers and don't even have ads (spam notwithstanding).

Quality in the things that are not important to contributors, but are important to many of the people who do not contribute? Not so high.

Now I'm not sure that I follow. Sure, there's lots of stuff that lacks the polish of countless missing man-hours, but we've all come a really long way since the 80s/90s. I'm sure we'll get there if we don't fuck up before that.
I've also seen lots of examples of features that were unimportant to the contributors, but since there was an itch to scratch e.g. in getting recognition from their users, a similar level of rigor was applied to satisfy them.
(Certainly, there are lots of negative examples too, but the point stands that some devs received little "physical" value for their work and yet the projects still thrive(d). I was, of course, assuming that you meant money when you said "paying for things" in your original post.)

Comment: Re:It's just hard work and machine learning (Score 1) 68

by qpqp (#49332419) Attached to: Ask Slashdot: What Happened To Semantic Publishing?
I agree that the tools are currently insufficient (though quite powerful, e.g. Protege), but I also believe that it's quite possible to achieve a high level of accuracy by combining better tools, dividing the problem space and working on killer features that require this higher level of abstraction.
Ideally, people (at first for industrial applications) would recognize the need for a proper machine-readable representation of the different states of a specific environment, so that eventually the different ontologies could be mapped to each other.
An exhaustive (i.e. universal) categorization of all possible states (of everything) is largely unnecessary: even now, when we communicate with each other, we use the respective vocabulary of the specific topic/area/system and only (comparatively) rarely need to "interface" or intersect with other areas/vocabularies, e.g. when we want to draw parallels to a similar concept in a different system. With time, I'm sure we could even get to a meta-ontology and evolve our language and understanding accordingly.
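
To make the "mapping ontologies to each other" idea a bit more concrete, here is a minimal sketch in plain Python. Everything in it is hypothetical: the two vocabularies, the term names and the alignment table are made up for illustration, and a real setup would more likely use OWL/RDF tooling (e.g. something built in Protege) than dicts and sets.

```python
# Two hypothetical domain vocabularies, each a set of
# (subject, predicate, object) triples describing the same pump.
factory = {
    ("Pump7", "hasState", "Running"),
    ("Pump7", "locatedIn", "HallA"),
}
maintenance = {
    ("Asset-7", "status", "Operational"),
}

# Hypothetical alignment: term in the maintenance vocabulary
# -> equivalent term in the factory vocabulary.
alignment = {
    "Asset-7": "Pump7",
    "status": "hasState",
    "Operational": "Running",
}

def translate(triples, mapping):
    """Rewrite every term through the mapping; unmapped terms pass through unchanged."""
    return {tuple(mapping.get(term, term) for term in triple) for triple in triples}

# After translation both sources speak the same vocabulary and can be merged.
merged = factory | translate(maintenance, alignment)
print(sorted(merged))
```

The point is only that the two vocabularies stay separate and small, and the "interface" between them is an explicit, reviewable table rather than a universal categorization.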

Comment: Re:It's just hard work and machine learning (Score 1) 68

by qpqp (#49330637) Attached to: Ask Slashdot: What Happened To Semantic Publishing?

[...] the current Big Data and Machine Learning techniques [...] trump the whole categorization and knowledge extraction / data mining process [...]

Could you please explain how a statistical approximation can trump an exact model? I think that big data & co. is a step in the right direction with the means that we currently have available and that we'll get there eventually. There are too many benefits that would result from doing it properly to neglect the required effort.

Comment: Re:Clever? Yeah, right. (Score 1) 68

by qpqp (#49330577) Attached to: Ask Slashdot: What Happened To Semantic Publishing?

But, please, don't give me a blinking and whirling semantic web whereby every move of the mouse updates your ADHD-laden site.

FTFY. The semantic web is a vision that has little to do with what you described:

According to the W3C, "The Semantic Web provides a common framework that allows data to be shared and reused across application, enterprise, and community boundaries".[2] The term was coined by Tim Berners-Lee for a web of data that can be processed by machines.[3] While its critics have questioned its feasibility, proponents argue that applications in industry, biology and human sciences research have already proven the validity of the original concept.[4]

(From the related Wikipedia article.)
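
As a rough illustration of what "a web of data that can be processed by machines" can mean in practice, here is a toy sketch in plain Python. The triples and predicate names are invented for the example and are not taken from any W3C vocabulary; real deployments would use RDF and a query language such as SPARQL.

```python
# Illustrative facts as (subject, predicate, object) triples.
# The predicate names are hypothetical, not a real vocabulary.
triples = [
    ("TimBernersLee", "coined", "SemanticWeb"),
    ("SemanticWeb", "describedBy", "W3C"),
    ("SemanticWeb", "isA", "WebOfData"),
]

def match(pattern, data):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        t for t in data
        if all(p is None or p == v for p, v in zip(pattern, t))
    ]

# "Everything known about SemanticWeb", asked by a program rather than a person.
about_semweb = match(("SemanticWeb", None, None), triples)
print(about_semweb)
```

No blinking or whirling involved: the data is just structured so that software can ask questions of it.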

Comment: Re:I hope "semantic" != "annoying popups" (Score 3) 68

by qpqp (#49330489) Attached to: Ask Slashdot: What Happened To Semantic Publishing?

It's our fault.

It's Eternal September all the way down.

Where people are in the habit of paying for things, the providers of those things worry about quality.

Bullshit. The Internet was a fine place before YouTube and Google and continues to be so now. It just became more convenient, for everyone. Including the parasites.
Go look at other segments of the Internet: email, ftp, irc, jabber, torrents... dominated by a quality-oriented mentality!
Look at Linux (the systemd debacle notwithstanding;) ), BSD, the open source community in general... Sure, a lot is paid for, but even more is driven by enthusiasm first and foremost.

Comment: Re:Not considered a real risk - at least, until no (Score 1) 324

by qpqp (#49158889) Attached to: Ask Slashdot: How Does One Verify Hard Drive Firmware?
So, you're saying that it's so simple to get SHA256 collisions that thousands of people getting sued for torrenting can fuck these copyright companies right over?
I don't quite believe you: last time I checked, I needed quite a server farm to (reliably) produce one collision in a meaningful amount of time.
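
For a sense of scale, here is a back-of-the-envelope sketch in Python. It assumes the generic birthday attack (about 2^(n/2) hash evaluations for an n-bit digest) and a made-up aggregate hash rate of 10^15 SHA-256 evaluations per second; both the rate and the choice of attack are assumptions for illustration, not measurements. Under those assumptions, "quite a server farm" is, if anything, a generous estimate.

```python
import hashlib

# SHA-256 digest size in bits, read from hashlib rather than hard-coded.
DIGEST_BITS = hashlib.sha256().digest_size * 8

# Generic birthday bound: roughly 2^(n/2) evaluations to expect a collision.
birthday_work = 2 ** (DIGEST_BITS // 2)

# Assumed aggregate hash rate (hypothetical figure, chosen for illustration).
HASHES_PER_SECOND = 10 ** 15
SECONDS_PER_YEAR = 365 * 24 * 3600

years = birthday_work / HASHES_PER_SECOND / SECONDS_PER_YEAR
print(f"~2^{DIGEST_BITS // 2} hashes, about {years:.1e} years at the assumed rate")
```

At that assumed rate the estimate comes out on the order of 10^16 years, so a collision is not something a defendant could casually produce.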