History Database Project/C-base: ERD redux!

OK.  This is my third try to create a table structure for C-base.  Yesterday, I met with UVA Library database specialist Mary Ellen MacNeil, who manages a sophisticated FilemakerPro database for the Dolley Madison Papers.  She mentioned that “Notes” really seemed like the core table that should have access to every other table.  One sure way to make sure this could be supported is simply to put Notes at the center of the database and use join tables to connect it with every other table.  I’m not sure it’s right, but I think it’s the best way to move forward and continue developing C-base prior to the first meeting with the grad students on 1/30.  So, today, I try to learn more about using these join tables to open portals between tables to work with data.  



This image represents the ERD (Entity Relationship Diagram) behind C-base.  It describes the eleven tables (or entities) that house different kinds of data and visualizes their relationships to one another.  On its own, each table is like a single spreadsheet:  each row in the spreadsheet is a separate record in the database; each column is a field, or a general type of data.  (In the field “Bibliography,” for instance, you can create a bibliography citation record in Chicago style for The Ideological Origins of the American Revolution).  The power of a relational database is that we can put these different entities together to build searches, generate reports, and organize activities, mixing these data sets together in interesting ways.

Three key ideas inform this structure.

  1. Taking notes is at the center of historical research, and so Notes is the table that is at the center of C-base that all the other tables feed into. Notes are our annotations, comments, ideas, and transcriptions from Sources and Objects that we can tag using Keywords and arrange to support Projects.
  2. Different kinds of data might inform the notes but these must be organized on their own to get the most out of them and to keep our data “clean” (that is, free of repetitions and errors across the database). The Sources table includes bibliographic metadata about books, articles, archival documents, maps, and any other materials historians consult. Objects are the texts, images, PDFs, statistical tables, and documents we collect and store in digital form.  Projects are the scholarly products of this work with Sources, Objects, and Notes, such as chapters, books, articles, visualizations, and annotated bibliographies.  Agents are people–historical figures that we would like to keep track of in our notetaking. Keywords are the terms we use to tag our notes with themes and subjects that will be the way we search the database and organize our Notes to complete Projects.
  3. Join tables are the means by which we manage the relationships of these separate tables to Notes (and to one another).  As this ERD shows, each entity has a defined “one-to-many” relationship with at least one other entity.  Each of our primary tables (Agents, Projects, Objects, Sources, and Keywords) figure as the “one” or the parent in the one-to-many relationships with “many” or child tables that join to our key table, Notes.  We will use these join tables to make use of data from each of these primary tables in Notes.  Getting this relationship diagram right is the key to making a relational database work.  This structure provides a stable architecture on which we can combine data from each of these different tables/entities in illuminating ways.