Engage will have to accommodate a great deal of diversity in how data is structured. Each museum's collection is different, and unifying every type of collection under a single schema is a daunting task. Rather than imposing a "one size fits all" schema at the database level, Engage will support flexible schemas.
Because a schema cannot be fixed in advance during the design phase, a traditional SQL-based relational database is likely inappropriate.
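To make the flexible-schema idea concrete, here is a minimal Python sketch, assuming records are held as JSON-style documents. The two records, their field names, and the helper functions `fields_in_use` and `common_fields` are hypothetical illustrations, not part of Engage or of any particular database.

```python
import json

# Two hypothetical collection records; every field name here is invented
# purely for illustration.
painting = {
    "type": "painting",
    "title": "Untitled landscape",
    "medium": "oil on canvas",
    "dimensions_cm": {"height": 60, "width": 90},
}
fossil = {
    "type": "fossil",
    "title": "Ammonite specimen",
    "geological_period": "Jurassic",
    "find_site": "unknown",
}


def fields_in_use(documents):
    """Return the sorted union of top-level field names across all documents."""
    names = set()
    for doc in documents:
        names.update(doc.keys())
    return sorted(names)


def common_fields(documents):
    """Return the sorted field names that appear in every document."""
    if not documents:
        return []
    shared = set(documents[0])
    for doc in documents[1:]:
        shared &= set(doc)
    return sorted(shared)


if __name__ == "__main__":
    records = [painting, fossil]
    # Each record serializes on its own; no shared table definition is needed.
    for record in records:
        print(json.dumps(record, sort_keys=True))
    print("shared fields:", common_fields(records))
```

Only the fields the two records happen to share (`title` and `type` in this sketch) could be relied upon by a fixed relational table; everything else varies per collection, which is the situation a flexible schema is meant to handle.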
- A JSR-170 compliant Java Content Repository (JCR). A JCR stores data in a version-managed hierarchy of "nodes", each of which may store a flat collection of key-value pairs. Querying is performed via a reduced XPath or a limited SQL subset. This choice limits the hosting language to Java, and JCR mandates no wire protocol beyond a recommendation for WebDAV. Also, many useful aspects of an implementation (transaction support, lazy result sets, etc.) lie outside the JSR-170 specification, though improved support is coming in JSR-283. The full JSR-170 spec, however, is implemented by only one project, Apache Jackrabbit.
- Fedora Repository, part of the overall Fedora Commons repository project. Whilst Fedora Repository provides support for internal XML-oriented storage attached to a "Digital Object", it excels particularly in the management of binary data streams, and is likely to be a suitable repository for storing extremely large image or video streams. Whilst Fedora itself is implemented in Java, it specifies (XML-based) RESTful protocols for query and update. Also notable is the inclusion of Mulgara, an RDF-based triplestore, as part of the package (this somewhat answers the same semi-structured data requirement as CouchDB, for example).
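The node model described for JCR above can be pictured with a small Python sketch. This only illustrates the data shape (a hierarchy of nodes, each with flat key-value properties); it does not mirror the Java `javax.jcr` interfaces, and it omits versioning and querying entirely. The `Node` class, its methods, and the sample names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Node:
    """A node holding flat key-value properties and named child nodes."""
    name: str
    properties: Dict[str, str] = field(default_factory=dict)
    children: Dict[str, "Node"] = field(default_factory=dict)

    def add_child(self, name: str, **properties: str) -> "Node":
        """Create a child node with the given properties and return it."""
        child = Node(name, dict(properties))
        self.children[name] = child
        return child

    def find(self, path: str) -> Optional["Node"]:
        """Resolve a slash-separated path relative to this node."""
        node: Optional[Node] = self
        for part in path.strip("/").split("/"):
            if not part:
                continue  # an empty path resolves to this node itself
            node = node.children.get(part)
            if node is None:
                return None
        return node


if __name__ == "__main__":
    root = Node("root")
    collection = root.add_child("collection", owner="Example Museum")
    collection.add_child("item-1", title="Bronze figurine", period="Roman")
    item = root.find("/collection/item-1")
    print(item.properties if item else "not found")
```

Note that each node's properties stay flat: nested structure has to be expressed as further child nodes rather than as nested values.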
- CouchDB's Apache site
- What is CouchDB: ex-PowerPoint presentation online from Damien Katz
- Blog posting about using CouchDB from Python
- Another post about Couch and Python
- CouchDB's author weighs its pros and cons
- Sam Ruby on Ascetic Database Architectures
- A critique of CouchDB
- Sam Ruby's response to Dare's Critique
- Some informal CouchDB performance metrics; note the more recent benchmarks in the comments at the end.
- Jan Lehnardt's CouchDB-related blog
- Christopher Lenz talks about "join-like" queries in CouchDB
- CouchDB - A use case: Kore Nordmann talks about how to implement a groups and permissions system with the CouchDB query system
- Reddit thread with random commentary on CouchDB
- CouchDB bulk insert's performance: shows a performance "floor" at 2.5 million records on a 1 GB machine; the developers have not yet responded
- CouchDB in the browser, plus a bit about the Cloud vision for Couch
- Amazon SimpleDB and Eventual Consistency: in a distributed world, reads need not always be up to date.