- What is Creative Commons?
- Metadata and CC
- Off the desktop, onto the net
Welcome, and thanks for having me here this evening. My name is Nathan Yergler. I'm a software engineer for Creative Commons, and tonight I'd like to talk to you about three areas: what CC is and why it's a necessary institution, how we're using metadata and open source software to advance CC, and how you can share your creative works on the internet under a CC license. Along the way I hope to share some of the history of copyright and how it has changed, and to help you understand why you might want to license your works under a CC license.
I should point out before getting started, however, that I'm not a lawyer. I'm a geek, a hacker, a software engineer. Other staff members are far more qualified to talk on certain topics, so if someone has a question I can't answer, or don't feel like I can answer authoritatively, we'll make a note of it, and I'll post the answer on my blog after checking with people with bigger brains.
"Creative Commons is a nonprofit organization that offers flexible copyright licenses for creative works."
"some rights reserved"
- Build within the current copyright system
- Provide licenses for creative works
- Develop technology for tagging those works with metadata
- Work to lower the transaction costs for using copyright-protected material
Copyright is a set of exclusive rights granted by governments to regulate the use of a particular expression of an idea or information. At its most general, it is literally "the right to copy" an original creation. In most cases, these rights are of limited duration.
In the United States, these rights are allowed under Article I, Section 8 of the Constitution, which gives Congress the power "to promote the Progress of Science and useful Arts, by securing for limited Times to Authors and Inventors the exclusive Right to their respective Writings and Discoveries."
Congress first enacted copyright law under the Copyright Act of 1790, with a term of 14 years, renewable for an additional 14-year term. Congress also set forth two important conditions: registration of copyright, and deposit of a copy with the government (at the time the Secretary of State; the Library of Congress took on this role later).
From the start copyright was not a "property right", but a balanced proposition: strong enough to encourage creators to create, "porous enough to allow ... free flow of information" and benefit society as a whole.
- Copyright term has consistently increased
- Prior to 1976, the term was 28 years, renewable once
- Copyright Act of 1976 retroactively lengthened term:
- life + 50
- 75 years for works for hire, anonymous works, etc
- Copyright Term Extension Act of 1998:
- lengthened all terms by 20 years
- no work published in 1923 or later will pass into the public domain before 2019
The Copyright Act of 1976 and the Copyright Term Extension Act of 1998 are not completely without merit -- both brought the United States statutes into compliance with international agreements. The 1976 Act also codified the notion of "fair use" -- uses of works under copyright which are permissible without explicit permission from the author.
The problem here is that copyright has moved from something that was beneficial to both an individual and society as a whole, to something that increasingly benefits only the "author" (which is also increasingly a corporation instead of an individual).
- Build on the current legal system
Attribution Reuse Commercial Use
In addition to our own licenses, we also provide deeds and machine readable information to supplement the GNU GPL and LGPL.

The human readable deed for a license displays the freedoms, as well as the conditions placed on use of the licensed work. Embedded in the HTML of the deed, as well as the HTML snippet you get for embedding in your web page, is a chunk of metadata which describes the license terms and conditions in machine readable form.
metadata(n): data about data; "a library catalog is metadata because it describes publications"
Among other things, RDF helps different programs talk to each other. Imagine a world where everything had embedded RDF: When buying a plane ticket, for example, you could drag your flight itinerary onto your calendar program to add it to your calendar. You could drag a friend's top-ten songs list onto your music player, and it could try and obtain the songs for you automatically.
RDF can also be used to create more powerful search engines. Right now the only type of question you can ask a search engine is "What pages have these words in them?" When pages include RDF metadata, you will be able to ask more advanced questions like "What's the current temperature in California?" Programs can also use this information, like an alarm clock program that also displayed the current weather or a collage-making program that only used photos with permission.
Finally, metadata can be aggregated across the whole Web. A program could download all the top-ten song lists and, with the help of a pricing guide in RDF, calculate the cost of buying the most popular albums.
Metadata holds a lot of promise, but it won't be useful until people start adding it to their pages. Creative Commons hopes to help promote metadata by making it very easy for people to add metadata to their pages.
- RDF is a framework for metadata
- Defines statements that have:
- a subject ("Nathan")
- a predicate ("lives in")
- an object ("Fort Wayne")
So in this example we're talking about a statement which makes an assertion about where I live. "Fort Wayne" is the actual value of the metadata. The subject, "Nathan", describes what this metadata applies to. The predicate describes the relationship between the two.
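To make the subject/predicate/object idea concrete, here is a minimal Python sketch that models statements as plain three-part tuples. The `Triple` type and the `objects_for` helper are illustrative names of my own, not part of RDF or of any Creative Commons tool, and real RDF would use URIs rather than bare strings.

```python
from collections import namedtuple

# A statement has exactly three parts: subject, predicate, object.
Triple = namedtuple("Triple", ["subject", "predicate", "object"])

statements = [
    Triple("Nathan", "lives in", "Fort Wayne"),
    Triple("Nathan", "works for", "Creative Commons"),
]


def objects_for(statements, subject, predicate):
    """Return every object asserted for a given subject and predicate."""
    return [
        s.object
        for s in statements
        if s.subject == subject and s.predicate == predicate
    ]


print(objects_for(statements, "Nathan", "lives in"))
```

Because every statement has the same shape, a program can answer "where does Nathan live?" without knowing anything else about the data.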
<rdf:RDF xmlns="http://web.resource.org/cc/"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<Work rdf:about="http://creativecommons.org/">
<license rdf:resource="http://creativecommons.org/licenses/by/2.5/" />
</Work>
<License rdf:about="http://creativecommons.org/licenses/by/2.5/">
<permits rdf:resource="http://web.resource.org/cc/Reproduction"/>
<permits rdf:resource="http://web.resource.org/cc/Distribution"/>
<requires rdf:resource="http://web.resource.org/cc/Notice"/>
<requires rdf:resource="http://web.resource.org/cc/Attribution"/>
<permits rdf:resource="http://web.resource.org/cc/DerivativeWorks"/>
</License>
</rdf:RDF>
As you can see from this example, every subject, predicate and object has a fully qualified URI which describes that item. These URIs allow applications to match information about a single item from two sources in order to extend their "knowledge" about an object.
For example, the RDF above describes licensing information for the root page of the Creative Commons website. A separate RDF block could conceivably describe authorship information or subject information about the same resource. An application which knew how to read this information could very easily aggregate this information into a single display.
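As a rough sketch of that kind of aggregation, the Python below reads two RDF blocks about the same URI and merges their statements into one structure. It uses only the standard library's `xml.etree.ElementTree` and handles just the simple shape of RDF/XML shown above; a real application would use a proper RDF parser. The second block (`AUTHOR_RDF`) and its `dc:creator` value are hypothetical, made up for illustration, as are the `triples` and `aggregate` helper names.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Trimmed version of the licensing block shown above.
LICENSE_RDF = """<rdf:RDF xmlns="http://web.resource.org/cc/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <Work rdf:about="http://creativecommons.org/">
    <license rdf:resource="http://creativecommons.org/licenses/by/2.5/" />
  </Work>
</rdf:RDF>"""

# Hypothetical second source describing the same resource.
AUTHOR_RDF = """<rdf:RDF xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about="http://creativecommons.org/">
    <dc:creator>Creative Commons</dc:creator>
  </rdf:Description>
</rdf:RDF>"""


def triples(rdf_xml):
    """Yield (subject, predicate, object) for each simple statement."""
    root = ET.fromstring(rdf_xml)
    for node in root:
        subject = node.get("{%s}about" % RDF_NS)
        for prop in node:
            # ElementTree names tags "{namespace}local"; join them into a URI.
            predicate = prop.tag[1:].replace("}", "")
            # The object is either a linked resource or literal text.
            obj = prop.get("{%s}resource" % RDF_NS) or (prop.text or "").strip()
            yield subject, predicate, obj


def aggregate(*sources):
    """Merge statements from several RDF blocks, keyed by subject URI."""
    merged = {}
    for source in sources:
        for subject, predicate, obj in triples(source):
            merged.setdefault(subject, {}).setdefault(predicate, []).append(obj)
    return merged


info = aggregate(LICENSE_RDF, AUTHOR_RDF)
for predicate, values in info["http://creativecommons.org/"].items():
    print(predicate, values)
```

The two blocks never mention each other; the shared subject URI is what lets the program combine them.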
Metadata, in any format, does us no good unless it is available and "discoverable". Lots of file formats have specifications for embedding some set of metadata; for example, EXIF defines a specification for embedding photo information in a JPEG file. Since RDF is a format- and application-independent way of describing metadata, it is desirable to embed it in, or link it to, our digital files.
We'll look at HTML files first, for which there are several competing formats, including our own braindead way.
Linking to the metadata in a <link> tag is attractive because it validates and doesn't break existing client implementations. However, it requires users to maintain a separate file containing the metadata, and it isn't supported by all readers.
Including the metadata in the head or body portion of the document has the advantage that it's a very simple approach, and should pass seamlessly through parsers. But it often doesn't, and causes document validation to fail.
Inclusion as element attributes using a rel attribute is incredibly simple, and still validates. It also isn't technically RDF, and is limited to making statements about the current page. This is often what you want to do, but isn't as useful as a general-case solution.
Which leads us to our own braindead solution, placing the RDF in an HTML comment. This approach is simple, doesn't break anything and can be included in any HTML. It has the additional advantage that since the information is in the HTML document itself, it gets indexed by search engines. Of course, it also makes purists somewhat nauseous (as it should), and isn't supported by all readers.
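From the reader's side, the comment approach might be handled roughly as follows. This is a minimal sketch using Python's standard `html.parser` and `xml.etree.ElementTree`, not the code any particular tool uses; the sample page, its `example.org` URL, and the `CommentRdfFinder` and `license_urls` names are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
CC_NS = "http://web.resource.org/cc/"

# Hypothetical page with license RDF tucked inside an HTML comment.
PAGE = """<html><body>
<p>Some rights reserved.</p>
<!--
<rdf:RDF xmlns="http://web.resource.org/cc/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <Work rdf:about="http://example.org/page.html">
    <license rdf:resource="http://creativecommons.org/licenses/by/2.5/" />
  </Work>
</rdf:RDF>
-->
</body></html>"""


class CommentRdfFinder(HTMLParser):
    """Collect the text of HTML comments that look like RDF blocks."""

    def __init__(self):
        super().__init__()
        self.blocks = []

    def handle_comment(self, data):
        text = data.strip()
        if text.startswith("<rdf:RDF") and text.endswith("</rdf:RDF>"):
            self.blocks.append(text)


def license_urls(html):
    """Return the license URLs asserted in comment-embedded RDF."""
    finder = CommentRdfFinder()
    finder.feed(html)
    urls = []
    for block in finder.blocks:
        root = ET.fromstring(block)
        for element in root.iter("{%s}license" % CC_NS):
            urls.append(element.get("{%s}resource" % RDF_NS))
    return urls


print(license_urls(PAGE))
```

A browser ignores the comment entirely, which is why this approach doesn't break anything, while a metadata-aware reader has to go looking for it.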
We have defined a standard linkage string for linking file formats that don't directly support embedding, such as MP3 and OGG files. The example statement above contains several pieces of information: the copyright year, the copyright holder, the license URL and the verification URL. The verification URL is optional, but provides a way of verifying the file has truly been licensed.
The file at the verification URL contains the license information for the file, and makes assertions about a file with a specific SHA1 fingerprint.
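Checking a file against a claimed fingerprint could look something like the sketch below. It assumes the fingerprint is written as a `urn:sha1:` identifier with a Base32-encoded digest; that encoding is my assumption here, and the `sha1_urn` and `matches_fingerprint` names and the sample bytes are illustrative only.

```python
import base64
import hashlib


def sha1_urn(data):
    """Return a urn:sha1: identifier for the given bytes.

    Assumption: the digest is Base32 encoded, as in other urn:sha1 schemes.
    """
    digest = hashlib.sha1(data).digest()
    return "urn:sha1:" + base64.b32encode(digest).decode("ascii")


def matches_fingerprint(data, claimed_urn):
    """True if the bytes hash to the fingerprint the metadata claims."""
    return sha1_urn(data) == claimed_urn


# Stand-in for the contents of an MP3 or OGG file.
original = b"example audio bytes"
claimed = sha1_urn(original)

print(matches_fingerprint(original, claimed))
print(matches_fingerprint(original + b" (modified)", claimed))
```

The point of the fingerprint is that the license assertion is tied to exact file contents: change a single byte and the metadata at the verification URL no longer applies.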
Language-specific wrappers give us a language to address license-specific information.
>>> import ccrdf
>>> cc = ccrdf.ccRdf()
>>> cc.parse(rdf_block)
>>> cc.works()
[<ccrdf.ccrdf.ccWork instance at 0xb78a50ac>]
>>> cc.works()[0].licenses()
[u'http://creativecommons.org/licenses/by/2.0/de/']
>>> cc.works()[0].licenses().permits()
(u'Reproduction', u'Distribution', u'DerivativeWorks')
- Building a "digital library of cultural artifacts"
- Maintain several collections:
- Audio
- Live music
- Movies, including the Prelinger Archive
- Texts
- ...anything else
- ccPublisher provides a clean front end for licensing and uploading works

The photo sharing site Flickr (http://flickr.com) allows users to select a default license which is applied to all photos they upload, as well as individually license photos they upload. They also publish a feed which consists only of licensed photos.
- Shipped first version in 2004
- Preparing beta of version 2 for release this month
- Emphasis on interoperability, customization, reconfiguration