From Wikibase Personal data


Data license

Our data content (in the Item and Property namespaces) is covered by the Creative Commons CC0 Waiver (see Creative Commons CC0 License (Q38)), which states that you are free to share (copy, distribute and transmit) and remix (adapt) the work. Our terms of use cover the ethical and social norms for attribution that we expect from research, journalistic and activist communities. For any reuse or distribution, we would appreciate it if you made the CC0 terms for the work clear to others. The best way to do this is with a link to this web page. Again, this license is fairly permissive, so you shouldn't feel paralyzed by legalese if you are interested in using or redistributing our data content. Just be sure to read the terms of use, and if you have any questions about fair use, don't hesitate to contact us. A detailed FAQ on CC0, written by the Wikimedia community, can be found here.

Why not a copyleft license on data?

We came to the conclusion that this was the best licensing model after some hesitation. This hesitation stemmed from the hope that it would be possible to construct a copyleft data license, or reuse the only existing one (Open Database License (ODbL) v1.0 (Q1583)). Reading these references convinced us that this was futile, and thus a bad idea: List of references on data licenses (Q2008).

We want to highlight a key quote from Luis Villa (Q2906):

Eben Moglen has often pointed out that anyone who attacks the GPL is at a disadvantage, because if they somehow show that the license is legally invalid, then they get copyright’s “default”: which is to say, they don’t get anything. So they are forced to fight about the specific terms, rather than the validity of the license as a whole. In contrast, in much of the world (and certainly in the US), if you show that a database license is legally invalid, then you get database’s default: which is to say, you get everything. So someone who doesn’t want to follow the copyleft has very, very strong incentives to demolish your license altogether.

To understand why this quote matters, note that databases, unlike software, are not universally recognized as protected by copyright.

So what should we do instead, especially given that we want to recognize the work of everyone in this space? The tl;dr of Public licenses and data: So what to do instead? (Q2907) is "say no to licenses, say yes to norms". This is what we tried to define above, linking to our terms of use.

But you could have used my favorite license instead!

This is unlikely:

  • ODbL (copyleft database license, used by OpenStreetMap):

Unfortunately, many people have a good-faith desire to see copyleft-like results in other domains. As a result, they’ve gone the wrong way on this point.

ODbL is probably the most blatant example of this: even at the time, Science Commons correctly pointed out that ODbL’s attempt to create database rights by contract outside of the EU was a bad idea.

Unfortunately, well-intentioned people (including me!) pushed it through anyway. Similarly, open hardware proponents have tried to stretch copyright to cover functional works, with predictably messy results.

— Luis Villa (Q2906), one of the authors of the ODbL, who worked for OpenStreetMap at the time
  • CC-BY-SA is likely to work against your goals if you are based in Europe or in select other countries:

Where the Licensed Rights include Sui Generis Database Rights that apply to Your use of the Licensed Material...

It is important to remember that sui generis database rights exist in only a few countries outside the European Union, such as Korea and Mexico. Generally, if you are using a CC-licensed database in a location where those rights do not exist, you do not have to comply with license restrictions or conditions unless copyright (or some other licensed right) is implicated. Note that if you are using a database in a jurisdiction where you must respect database rights, and you receive a CC-licensed work from someone located in a jurisdiction without database rights, you should determine whether database rights exist and have been licensed.

Other datasets

For convenience, we maintain a list of related datasets and their licenses.

Software license

MediaWiki and the various extensions used for our hosting software have licenses listed at Special:Version.

The user-contributed software (for instance the user scripts and the gadgets in the MediaWiki namespace) is by default contributed under CC-BY-SA. This is because Wikidata uses that same license for its software, and because we expect contributors to cut and paste freely from Wikipedia and Wikidata.

Wikipedia and Wikidata encourage contributors to dual-license original contributions under the GPL, and we do the same here, on a per-file basis.

We think there might be interesting options to explore with copyleft mixed licenses. We would love to have a deeper conversation with you on those topics. Please do reach out.