The whole field has been dominated by research, i.e. the wish to make simple things complicated (in order to publish papers), as opposed to engineering, i.e. making complicated things simple (in order to produce usable software efficiently). As a result the standards are horrendously - and needlessly - complicated. The few major practical outcomes, like schema.org, JSON-LD and the Google annotation system, are results of engineering, not research. Alas, JSON-LD has also taken a turn towards hypercomplexity.
Yeah, this is an unfortunate consequence of having the whole ecosystem mostly within academia, including the lack of tutorials and proper documentation (e.g. not a 500 page standard).
IMO the most interesting place right now for semantic web development is Wikidata. It's still pretty difficult for newcomers to contribute (as is the case for all Wikimedia projects) but at least it has many eyeballs and a very active community / ecosystem.
Maybe that's a good indicator that there is only minor (industry) need/benefit. The "biggest" knowledge graph is Google's, but it is unclear how much of it is actually Semantic Web and how much is search, ML, NLP, etc.
They are all nice ideas, but the practical use cases are rare. I am skeptical of the often-touted use case in medicine/drug interactions. The only time I saw it in industry, it was not really used by the lab technicians, because all the questions the system could answer were trivial. The promise that "the system can infer new combinations/interactions" was never fulfilled.
So it's a lot of extra work to sift through, but I've found a lot of gold in there.
If you're looking for a simple, noise-free way to do the semantic web, I'm very confident that Tree Notation will enable it (https://treenotation.org/). I've played around a bit with turning Schema.org into a Tree Language, and think that would be a fruitful exercise, but plenty more on the plate first.
FWIW I've pitched this concept to W3C for 4 or 5 years to no avail yet. I think though if someone can put together a decent prototype the idea might start clicking.
Imagine a noise free way to encode the semantic web with natural 3-d positional semantics. Could be cool!
It is unclear to me what it would achieve compared to a spog (subject, predicate, object, graph) based representation like it exists in RDF based triplestores.
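For anyone not familiar with the spog model mentioned above, here is a minimal sketch of it: a store is just a set of (subject, predicate, object, graph) tuples, and a query is a pattern with wildcards. This is an illustration only, not the API of any particular triplestore; the `Quad` type, the `match` helper and the `ex:` identifiers are made up for the example.

```python
from typing import Iterable, Iterator, NamedTuple, Optional


class Quad(NamedTuple):
    s: str  # subject
    p: str  # predicate
    o: str  # object
    g: str  # named graph the statement belongs to


def match(quads: Iterable[Quad],
          s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None,
          g: Optional[str] = None) -> Iterator[Quad]:
    """Yield every quad that agrees with the pattern; None is a wildcard."""
    pattern = (s, p, o, g)
    for quad in quads:
        if all(want is None or want == got for want, got in zip(pattern, quad)):
            yield quad


# Illustrative data; the ex: names are hypothetical.
store = {
    Quad("ex:EiffelTower", "rdf:type", "schema:LandmarksOrHistoricalBuildings", "ex:graph1"),
    Quad("ex:EiffelTower", "schema:containedInPlace", "ex:Paris", "ex:graph1"),
    Quad("ex:MoulinRouge", "schema:containedInPlace", "ex:Paris", "ex:graph2"),
}

# "Which subjects are contained in Paris, in any graph?"
in_paris = sorted(q.s for q in match(store, p="schema:containedInPlace", o="ex:Paris"))
```

Any alternative encoding would need to carry the same four pieces of information per statement, which is why the question of what a new notation adds is a fair one.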
Yes you are right. Semantic triplets are great. I think the semantics are largely the same. Here's my work in progress argument for why this is relevant.
My take with ontologies is building consensus is hard. Tree Notation offers a solution to the problem of: what should we agree on for the encoding? I assume that simpler is better, all else being equal. Then Tree Notation is the simplest, in terms of the thing with the fewest pieces (tokens).
To get to Tree Notation, nothing was added, only stripped. I started with an existing notation and stripped away each visible syntax token that wasn't needed. Surprisingly, not one is needed. Not one quote, parens, bracket, colon, etc.
So now if we can get consensus around going with the simplest thing, we have a way to agree on whether we should use XML, JSON-LD, Turtle, etc. The simplest thing works (which would be Tree Notation, or a close relative; someone can rebrand the notation but the idea is largely the same). This does not suffer from the xkcd 927 problem (one more competing standard), as there are a few classes of things where we do have one new language that is mathematically superior and of a different kind than the others (binary notation, for example).
So after you have agreement on that encoding, versioning and forking and merging schemas is dead simple (just use Git; in Tree Notation all changes are semantic and noise free).
So now we've solved what encoding to use for our ontologies, and we have a very fast and efficient way to collaborate on them (it's just plain text and git).
That brings us to a third advantage which is more theoretical. Tree Notation maps words/nodes to a 3-D representation. This means that there would be an X-Y-Z isomorphism with an ontology and the real world. I don't really know where we go from there, but at least by this point we've moved the semantic web idea a lot further and can start looking at the next realm of possibilities.
A very good point. Eiffel Tower photos will only show up at the tower itself when people are, ironically, taking photos from the tower instead of of the tower. It's much too big to get a good photo of with normal point-and-shoot lenses. There will be related hotspots, such as from across the river, but lots of random photos on the streets will be of the tower from different angles.
Whereas the Moulin Rouge is tucked into a street, and that street is the only place you can snap it.
A great innovation in camera tech would be to tag the 'main' object being photographed if it is more than x metres away from the camera itself. Using GPS for direction, and using some type of algorithm for detecting the main subject, I'm sure you could get a decent approximation of the target's location as well as the shooter's location. That would make for some interesting data, and you'd be able to sort on landscape vs portrait photos just by examining the GPS data.
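The geometry behind that idea is small. Here is a rough sketch, assuming the camera records its own GPS position, a compass bearing and an estimated subject distance (the comment doesn't say how any of these would be obtained). It uses the standard spherical-Earth destination-point formula; the function name and the mean Earth radius constant are choices made for the example.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius; spherical approximation


def project_target(lat_deg: float, lon_deg: float,
                   bearing_deg: float, distance_m: float) -> tuple:
    """Estimate the subject's position from the shooter's position.

    bearing_deg is the compass direction the camera points (0 = north,
    90 = east); distance_m is the estimated distance to the subject.
    """
    lat1 = math.radians(lat_deg)
    lon1 = math.radians(lon_deg)
    theta = math.radians(bearing_deg)
    delta = distance_m / EARTH_RADIUS_M  # angular distance travelled

    lat2 = math.asin(math.sin(lat1) * math.cos(delta)
                     + math.cos(lat1) * math.sin(delta) * math.cos(theta))
    lon2 = lon1 + math.atan2(math.sin(theta) * math.sin(delta) * math.cos(lat1),
                             math.cos(delta) - math.sin(lat1) * math.sin(lat2))
    # Wrap longitude back into [-180, 180) degrees.
    lon2 = (lon2 + 3 * math.pi) % (2 * math.pi) - math.pi
    return math.degrees(lat2), math.degrees(lon2)
```

With both the shooter's point and the projected target point stored, a heatmap could plot either one, and the subject distance alone would be enough to separate close-up shots from distant ones.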
The marked locations in high-res areas like cities are based on both the most popular Wikipedia articles and the most popular Foursquare locations in the intense area. Quite likely Alex's place is one of these highly popular Foursquare locations.
One of the authors here. A few answers quickly. It does write to disk: you can either dump memory or write all changes to a log (turn it on/off yourself). Sure, it has a global read/write lock, with several locking strategies to select from (a task-fair atomic spinlock queue, or a reader-preference or a writer-preference spinlock). It is definitely meant to be a simple library. We strived to document it carefully to make usage as easy as possible. Yes, you can very easily form lists, trees or any other pointer structures. Happy to see it on Hacker News, we never really expected that :)
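For readers unfamiliar with the locking terms: a reader-preference lock lets any number of readers share the data at once, while a writer has to wait until all of them have left. The sketch below illustrates that policy in Python with blocking `threading` primitives rather than spinlocks. It is not the library's implementation, and the `ReaderPreferenceLock` class is hypothetical.

```python
import threading
from contextlib import contextmanager


class ReaderPreferenceLock:
    """Global read/write lock where readers are never blocked by waiting writers."""

    def __init__(self):
        self._readers = 0                     # number of active readers
        self._count_lock = threading.Lock()   # protects _readers
        self._resource = threading.Lock()     # held by a writer or by the reader group

    @contextmanager
    def read(self):
        with self._count_lock:
            self._readers += 1
            if self._readers == 1:
                # First reader in locks writers out on behalf of all readers.
                self._resource.acquire()
        try:
            yield
        finally:
            with self._count_lock:
                self._readers -= 1
                if self._readers == 0:
                    # Last reader out lets writers in again.
                    self._resource.release()

    @contextmanager
    def write(self):
        # A writer needs exclusive access: no readers and no other writer.
        with self._resource:
            yield


lock = ReaderPreferenceLock()
table = {}

with lock.write():
    table["key"] = "value"

with lock.read():
    value = table["key"]
```

The trade-off of this policy is that a steady stream of readers can starve writers, which is presumably why a writer-preference and a fair queue variant are offered as alternatives.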
The license page is quite clear about why. The authors want applications that are distributed and marketed as database systems for use by other developers to be under GPLv3.
You might ask why they want that, and that could be an interesting read. My best guess: They are themselves developers.
I appreciate the sentiment, but I was looking forward to using this until I saw that part. I (like I assume many others) cannot use a GPL3 library (and I'm in academia). If you want any sort of traction for a library, GPL3 is not the way to go.
This is why the LGPL was created, so that you can have modifications done on your library be free-as-in-speech, but still make the library as a whole useable for a wide variety of other projects, including closed-source versions.
Having a separate requirement to email you for a free-as-in-beer license is just overly complicated for this. The more hurdles you put up for people, the fewer will adopt the library. I think that licensing is one of those cases where it doesn't pay to be clever. Plus, what happens when you decide to stop maintaining the code? Do you want to keep getting emails for licenses years from now?
Edit: in last paragraph, I said free-as-in-speech, but meant beer (see comment below).
The default GPL is free-as-in-speech. You do not have to email for GPL. You have to email for free-as-in-beer. I assume that in case free-as-in-speech is not OK, it is also not a major hurdle to email for the free-as-in-beer version. In case emailing is a major hurdle, maybe you do not really need the free beer part.
Should we stop maintaining the code or get bored mailing free beer licences, we'll very likely change the licence to LGPL or MIT. Until then beer comes via email.
In the case of changing licenses, make sure over the course of your project maintainership that you have the right to relicense all the code, including patches/contributions from others.
I wish it was simply licensed MIT or BSD, but congratulations on your software and sticking to your convictions.
> I (like I assume many others) cannot use a GPL3 library (and I'm in academia).
Is it copyleft in general, or the patent grant, that hinders your work in academia? I'm not sure why you should be using other people's work for free, but then go around and sue anyone who copies or improves on your work.
The project wrote down exactly what they wanted to do with their work on their license page. I say good for them. More people should do so and think what they themselves want.
I've had academic licensing offices balk at the GPL. I've had my fights with people over this, and lost. There are some specific clauses that they didn't like (this was GPL2). However, they rarely have problems with MIT/BSD licenses, so in general, that's what I try to use.
My stance is that since they did the work, the authors of the library can license it however they'd like. But, if they wanted to get more people using their library, I think that they should rethink their approach. LGPL is more appropriate for a library, where you can still have your copyleft approach for the code you wrote, while still promoting wider use.
Here's an extreme edge case... as they said, if they get tired of supporting the email to get a free-as-in-beer license, they will just open it up with an MIT/BSD style license and be done with it. That's great. But what if someone gets hit by a bus? Or someone leaves the project and moves to Antarctica? There would be no practical way to release an unencumbered version.
Really though, they can do what they want - it's their code. But licensing is one of those areas where you really shouldn't try to be clever.
> But, if they wanted to get more people using their library, I think that they should rethink their approach.
They actually don’t want as many people as possible using their library. That is not the goal when choosing the GPL. The goal is to maximize the number of free users in the world, that is, users who have the freedoms which define Free Software. The mere number of users is inconsequential. If users are what you desire, then by all means, choose a permissive license (MIT/BSD/etc).
I'd just make a remark that even in the GPL world everything is not as simple as it looks. There are GPL versions with _exceptions_ endorsed by RMS, for example. A long time ago I used to work on a Hobbit scheme compiler for the scm interpreter, which was promoted by RMS and became Guile later. scm had such a GPL-with-exceptions clause by RMS, which was stated clearly incorrectly. I take every chance to boast that I convinced RMS to fix the error in his own GPL version for scm :)
Yes, there's a lot of subtlety to licensing. Personally I think it should be taught in computer science schools. Open source software has really changed the dynamics of corporations. Of course we wouldn't have the open source movement without free software, and imho free software is more important than ever. You seem to have struck a nice balance with this exception that protects your interests in the database space.
And yes, getting RMS to change something is quite the accomplishment :) His ability to walk the talk is impressive.
Conditional compilation can quite certainly be used. The best way to find out the space requirements is to try out some of the examples provided. I cannot give the overhead exactly, but it can be read from the source without too much effort. Send an email to tanel.tammet at gmail.com if you need help with that. In broad terms, we have been very careful with using memory, both for the reasons you state and to get more bang from the cache.