History Database Project/C-base: ERD redux!

OK.  This is my third try at creating a table structure for C-base.  Yesterday, I met with UVA Library database specialist Mary Ellen MacNeil, who manages a sophisticated FileMaker Pro database for the Dolley Madison Papers.  She mentioned that “Notes” really seemed like the core table that should have access to every other table.  One sure way to support this is simply to put Notes at the center of the database and use join tables to connect it with every other table.  I’m not sure it’s right, but I think it’s the best way to move forward and continue developing C-base prior to the first meeting with the grad students on 1/30.  So, today, I try to learn more about using these join tables to open portals between tables to work with data.



This image represents the ERD (Entity Relationship Diagram) behind C-base.  It describes the eleven tables (or entities) that house different kinds of data and visualizes their relationships to one another.  On its own, each table is like a single spreadsheet:  each row in the spreadsheet is a separate record in the database; each column is a field, or a general type of data.  (In the field “Bibliography,” for instance, you can create a bibliography citation record in Chicago style for The Ideological Origins of the American Revolution.)  The power of a relational database is that we can put these different entities together to build searches, generate reports, and organize activities, mixing these data sets together in interesting ways.

Three key ideas inform this structure.

  1. Taking notes is at the center of historical research, and so Notes is the table that is at the center of C-base that all the other tables feed into. Notes are our annotations, comments, ideas, and transcriptions from Sources and Objects that we can tag using Keywords and arrange to support Projects.
  2. Different kinds of data might inform the notes but these must be organized on their own to get the most out of them and to keep our data “clean” (that is, free of repetitions and errors across the database). The Sources table includes bibliographic metadata about books, articles, archival documents, maps, and any other materials historians consult. Objects are the texts, images, PDFs, statistical tables, and documents we collect and store in digital form.  Projects are the scholarly products of this work with Sources, Objects, and Notes, such as chapters, books, articles, visualizations, and annotated bibliographies.  Agents are people–historical figures that we would like to keep track of in our notetaking. Keywords are the terms we use to tag our notes with themes and subjects that will be the way we search the database and organize our Notes to complete Projects.
  3. Join tables are the means by which we manage the relationships of these separate tables to Notes (and to one another).  As this ERD shows, each entity has a defined “one-to-many” relationship with at least one other entity.  Each of our primary tables (Agents, Projects, Objects, Sources, and Keywords) figures as the “one,” or the parent, in a one-to-many relationship with a “many,” or child, table that joins it to our key table, Notes.  We will use these join tables to make use of data from each of these primary tables in Notes.  Getting this relationship diagram right is the key to making a relational database work.  This structure provides a stable architecture on which we can combine data from each of these different tables/entities in illuminating ways.
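
To make the join-table idea concrete, here is a minimal sketch in Python with SQLite rather than FileMaker.  The table names, column names, and placeholder records are my own assumptions for illustration; they are not C-base’s actual fields.  It shows only one of the joins (Notes to Keywords), where each row of the join table links one note to one keyword.

```python
import sqlite3

# Hypothetical schema for illustration only; not the real C-base field list.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE notes    (note_id INTEGER PRIMARY KEY, body TEXT NOT NULL);
CREATE TABLE keywords (keyword_id INTEGER PRIMARY KEY, term TEXT NOT NULL UNIQUE);
-- Join table: the "many" side of two one-to-many relationships.
CREATE TABLE note_keywords (
    note_id    INTEGER NOT NULL REFERENCES notes(note_id),
    keyword_id INTEGER NOT NULL REFERENCES keywords(keyword_id),
    PRIMARY KEY (note_id, keyword_id)
);
""")

# Placeholder records, invented for the example.
conn.executemany("INSERT INTO notes (note_id, body) VALUES (?, ?)",
                 [(1, "Placeholder note on a pamphlet"),
                  (2, "Placeholder note on a map")])
conn.executemany("INSERT INTO keywords (keyword_id, term) VALUES (?, ?)",
                 [(1, "trade"), (2, "cartography")])
conn.executemany("INSERT INTO note_keywords (note_id, keyword_id) VALUES (?, ?)",
                 [(1, 1), (2, 1), (2, 2)])


def notes_for_keyword(connection, term):
    """Return the ids of all notes tagged with the given keyword term."""
    rows = connection.execute(
        """
        SELECT n.note_id
        FROM notes AS n
        JOIN note_keywords AS nk ON nk.note_id = n.note_id
        JOIN keywords AS k ON k.keyword_id = nk.keyword_id
        WHERE k.term = ?
        ORDER BY n.note_id
        """,
        (term,),
    ).fetchall()
    return [row[0] for row in rows]


print(notes_for_keyword(conn, "trade"))
```

The same pattern would repeat for Agents, Projects, Objects, and Sources: one join table per primary table, each pointing back to Notes.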

History Database Project (C-base): Thinking through Sources

Our FileMaker Pro template is taking shape in preparation for an initial presentation on January 30.  I’m going to call it “C-base” (short for Corcoran Department of History Research Database).

This week I met with Ivey Glendon, manager of the Metadata Analysis & Design Unit at the UVA Library.  We discussed the great promise of using C-base as a personalized research portal that would open onto the web to make use of state-of-the-field tools and databases.

We brainstormed ways to create a rich group of fields in the Sources table without overburdening users who might not want or need all of that articulated metadata associated with our books, articles, maps, and archival materials.

The tentative solution: two layouts for the Sources table, one called Citations and the other called Sources.  “Citations” has just two fields visible: Footnote and Bibliography.  Each can hold your source citation in the correct Chicago-style format with the least amount of fuss.  If you use Zotero along with C-base, you can simply generate these two kinds of citations and paste them in.  If you don’t, library catalogs and other databases usually generate formatted citations.  You can use the Library’s Virgo catalog, for instance, to look up a source like this one.  Then pull down the Item Action menu and select “cite,” which takes you to this page, from which you can copy the citation and paste it into these fields.  It looks like the UVA Library is going to start generating these in Chicago style–so little to no editing will be necessary.

So that’s fine for a basic citation, but I’m thinking it’s worth a bit more time to have the full range of metadata for every source within C-base, allowing for much more detailed analysis.  I’m interested in researching a book project called “The Political Economy of the American Revolution” that would involve working with hundreds of original pamphlets.  It will be worthwhile to be able to search through this corpus by distinct fields.  I could then list these pamphlets in order of publication date and visualize them on a timeline with other events, compare them by place of publication, and do some text analysis of the terms in the titles.  But I can’t do any of this if all of the metadata is lumped together in a single text field.  In the Sources table, I’ve recreated the full Zotero field list, which is displayed in the Sources layout.  My provisional plan is to leverage the power of Zotero’s bibliographic tools to populate these fields in C-base.  Here’s the idea:  add bibliographic items to Zotero collections, output these to a CSV (spreadsheet) file, and then upload this metadata to C-base.  That’s a few extra steps, but in my tests it works well–especially because FileMaker can import data from files and match the column headers of a spreadsheet to its field names.  One could just forget about C-base and do all of this within Zotero without all of the extra work, but I’m just not satisfied with Zotero as a tool for note-taking and analysis.  Because it’s such a great bibliographic management system, however, I’m working to find the best ways to link it to C-base.  We are going to look into RefWorks and see what that has to offer.  This reminds me of David Weinberger’s book, Small Pieces Loosely Joined: A Unified Theory of the Web, which argues that the ideal tools for digital communication aggregate particular tools that each do their particular thing quite well.
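
Here is a rough sketch, in Python rather than FileMaker, of why separate fields matter once the metadata arrives as a CSV.  Everything specific in it is an assumption for illustration: the column headers (Title, Publication Year, Place) stand in for whatever the exported file actually contains, the field names in FIELD_MAP are hypothetical, and the three rows are placeholders, not real pamphlets.

```python
import csv
import io
from collections import Counter

# Placeholder rows standing in for an exported CSV; headers are assumed.
SAMPLE_CSV = """Title,Publication Year,Place
Placeholder Pamphlet A,1774,London
Placeholder Pamphlet B,1765,Boston
Placeholder Pamphlet C,1770,London
"""

# Hypothetical mapping from spreadsheet column headers to database field names.
FIELD_MAP = {"Title": "title", "Publication Year": "year", "Place": "place"}


def load_sources(csv_text):
    """Read CSV text and return one dict per source, keyed by field name."""
    reader = csv.DictReader(io.StringIO(csv_text))
    records = []
    for row in reader:
        record = {field: row[header].strip() for header, field in FIELD_MAP.items()}
        record["year"] = int(record["year"])  # a distinct, sortable field
        records.append(record)
    return records


sources = load_sources(SAMPLE_CSV)

# List sources in order of publication date.
by_date = sorted(sources, key=lambda record: record["year"])

# Compare sources by place of publication.
by_place = Counter(record["place"] for record in sources)

for record in by_date:
    print(record["year"], record["place"], record["title"])
```

Neither the sort nor the tally would be possible if the year and place were buried inside a single formatted citation string.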


History Database Project: Creating an “Entity Relationship Diagram”

To build the History Research Database in FileMaker, we need to define our “entities”–that is, the core sorts of data that will make up the database. Entities are also known as “tables.”  We might think of an entity as a single spreadsheet: each row in the spreadsheet is a separate record in the database; each column is a field, or a general type of data. The power of a relational database is that we can put these different entities together to build searches, generate reports, and organize activities.  Here’s my first crack at an Entity Relationship Diagram (ERD) with three possible fields in each table (there will be many more in fact).

[Screenshot: first-draft Entity Relationship Diagram]


I propose four entities in the database:

Sources: information about books, articles, archival materials, etc.

Objects: research objects we want to store and keep track of, such as documents, maps, images, graphs, PDFs, statistical tables, website URLs, etc.

Notes: our annotations, comments, and transcriptions of sources and objects, tagged with subject headings and linked to writing and digital projects.

Projects: what we produce from our analysis of our sources, objects, and notes, such as chapters, books, articles, visualizations, annotated bibliographies, etc.

As this ERD shows, each entity has a defined “one-to-many” relationship with at least one other entity.  Getting this right is the key to making a relational database work.  The connectors indicate that every Source in the database can have many Objects (for example, I recently took a bunch of photos of maps in the wonderful Historical Atlas of Maine: one book, many maps), but not the other way around.  I can take many Notes from this book, my Source, but I’m going to limit myself to no more than one Source per Note.  As I build my Projects, I will probably associate many Notes and Objects with a particular Project, but not the other way around.  (This last set of relationships is a bit tricky, and I will possibly have to revise).
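
As a way of checking my own reasoning, the one-to-many rules above can be sketched in Python with SQLite.  This is only an illustration under assumptions: the table and column names are hypothetical, the object labels are placeholders, and Projects are left out to keep it short.  The “one Source, many Objects” and “one Source per Note” rules both become a single source_id column on the child table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
CREATE TABLE sources (source_id INTEGER PRIMARY KEY, title TEXT NOT NULL);
-- Each object belongs to exactly one source; a source can have many objects.
CREATE TABLE objects (
    object_id INTEGER PRIMARY KEY,
    source_id INTEGER NOT NULL REFERENCES sources(source_id),
    label     TEXT NOT NULL
);
-- Each note cites exactly one source; a source can have many notes.
CREATE TABLE notes (
    note_id   INTEGER PRIMARY KEY,
    source_id INTEGER NOT NULL REFERENCES sources(source_id),
    body      TEXT NOT NULL
);
""")

conn.execute("INSERT INTO sources (source_id, title) VALUES (1, 'Historical Atlas of Maine')")
conn.executemany("INSERT INTO objects (source_id, label) VALUES (?, ?)",
                 [(1, "Placeholder map photo 1"), (1, "Placeholder map photo 2")])
conn.execute("INSERT INTO notes (source_id, body) VALUES (1, 'Placeholder note')")


def count_objects(connection, source_id):
    """Count the objects attached to one source."""
    row = connection.execute(
        "SELECT COUNT(*) FROM objects WHERE source_id = ?", (source_id,)
    ).fetchone()
    return row[0]


# An object pointing at a source that does not exist should be rejected.
try:
    conn.execute("INSERT INTO objects (source_id, label) VALUES (99, 'Orphan object')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(count_objects(conn, 1), orphan_rejected)
```

Because the note row has only one source_id column, a note simply cannot point to two sources, which is the limit I have set for myself here.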

So, for now, this is my presumed ERD for the History Research Database.  Comments?  Suggestions?  I’ll keep working on this and post a full diagram with all of the fields as I make more progress.

History Database Project: Snap Judgment on “Heurist”

Heurist’s “research-driven data management system allows any confident researcher or data manager to design, create, manage, analyse and publish their own richly-structured database(s) within hours, through a simple web interface, without need of programmers or consultants.”

Pros: It’s sophisticated, powerful, supported by a user community, free, and offers online storage.  Lots of flexibility with importing data; syncs with Zotero; and built for humanities research. Lots of specialized tools and templates for particular research databases. Helpline is available for assistance.  Can set up multi-user database for group projects.

Cons: This is a complex tool with a high barrier to entry and lots of functions and interface items to sift through before one can create a usable history research database for general note taking. Can’t use it offline, and works slowly online.

Verdict: I could see learning this interface and building the History Research Database on it.  It seems like a really interesting project, especially with the many visualization tools built into it.  But there are too many specialized fields for my taste, and I think inputting data would take a long time.  FileMaker offers speed, elegance, and especially a high degree of control over layout and appearance, which Heurist doesn’t.  I think I can build a better notes template that all could use and modify with FileMaker.  Heurist’s advantage of not having to know how to manage entities and relational tables doesn’t really apply, since I’m planning on building these myself anyway.


History Database Project: Reviewing Resources

As I get started developing a FileMaker Pro database for history research, I’m going through the “Learning FileMaker 16” tutorial on Lynda.com, a subscription service offered through my university library.  But I also want to investigate some other resources that might point the way forward.  I’ll be consulting:

DiRT (Digital Research Tools), a “registry of digital research tools for scholarly use.”

The Institute of Historical Research’s free online course on “Designing databases for historical research.”

Ansley T. Erickson’s “Historical Research and the Problem of Categories: Reflections on 10,000 Digital Notecards (Fall 2011 version).”  

The FileMaker template (“starter solution” in their lingo) called “Research Notes.”

My process in reviewing these resources is simple:  I ask, Is this the way I want to take notes, keep track of sources and objects, and plan projects?  Yes?  Great!  I’m done.  No?  Can I learn something by clarifying what doesn’t work for me?  Can I make use of this tool for a particular purpose or replicate the things about it that I like?


Updated Florida Digital Atlas for The New Map of Empire released

Today I completed and published an updated digital atlas for chapter six, “Defining East Florida.”  Like all atlases for the book, it can be found at the New Map of Empire atlas page. The atlas features maps of peninsular Florida before and after the Peace of Paris, the 1763 treaty that granted this Spanish province to Great Britain.  It examines how the British tried to take command of the province to colonize it.  East Florida was distinctive in this process of taking possession of new territories because it was–by a long shot–the least understood place Britain acquired.  The first part of the atlas documents how mapmakers came to see the peninsula as a collection of islands instead of a part of the mainland.  Literary scholar Michelle Navakas (English, Miami University) first identified this geographic idea and discusses it at greater length in her new book, Liquid Landscape: Geography and Settlement at the Edge of Early America (University of Pennsylvania Press, 2017).

The British put Florida back together again, seeking to describe a province better suited for plantation development.  Surveyor General William De Brahm charted its coastline extensively in the 1760s, documenting river inlets that promised to become vectors for settlement.  He came to believe, however, that southern Florida was so volatile as a natural place that it would be difficult for Britain to colonize effectively in the short term.  His scientific views brought him into conflict with East Florida Governor James Grant, who was bent on his colony’s rapid development by metropolitan planters.