Wednesday, March 29, 2006

Raising the Bar for Record Managers

Most people who get involved with genealogy today use a family history application called a record manager to store and navigate lineage-linked genealogical information. There are many to choose from. Some of the more common are Personal Ancestral File (PAF), Family Tree Maker, Legacy Family Tree, AncestralQuest, RootsMagic and The Master Genealogist. There is also a range of online record managers, like phpGedView, The Next Generation and PedigreeSoft, which in some respects are more cumbersome than the traditional desktop applications but offer the advantage of easy sharing and collaboration.

While these applications are very effective at organizing lineage-linked data, their user experience and complexity are on par with filing your taxes (my apologies to the makers of these products; I know most of them and hope they don’t take offense at this observation).

There have been a couple of interesting advances in the space over the last few years. Notably, the move toward online record managers (phpGedView) and research guidance (Legacy 6) definitely show promise. So in the spirit of taking genealogy to common people, here is my top ten list of what it would take to raise the bar for record managers.

Top 10 Innovations Needed in Record Managers

  1. Living memory interview – I have personally done usability testing and observed ordinary people taking 30 minutes to an hour to figure out how to enter themselves and their parents into a record manager. When someone starts fresh in a record manager why isn’t one of the options to start a new file from what I know? This option would lead the user through a nice wizard-like living memory interview.

  2. Path to me – Once you browse a few generations it is impossible to tell which path leads back to me. Isn’t there a simple way to add this bit of information to the UI?

  3. Maps, maps, maps – Google Maps, Google Earth, open APIs, need I say more? Maybe not, but I will. Users need current and historical maps for research. Seeing a historical map helps me feel connected to my ancestors. Overlaying data on maps is interesting. For example: migration patterns on maps, plotting the events of an ancestor’s life on a map, showing the overland trail they used to come west, showing the plat map of the town they lived in, or showing everyone with the first name of Deodat living in the US in the 1850 census.

  4. Context – Users need lots of context to hold their interest, keep oriented, and to aid in research. Maps (as mentioned above), history (as mentioned in a previous article - record managers really need to integrate with WeRelate.org), historical texture (music of the time, clothes of the time, the price of gas) and anything else you can think of.

  5. Clue Pad/Scratch Pad – This ties back to the need for context. Users need a scratch pad to keep their clues on. The scratch pad needs to let them get back to the clue in context of their pedigree and the source information they were evaluating. It also needs to let them model simple things like: “The wife of John Smith is either Mary Jones or Mary Johnson,” and still keep them in context of the clues that led them to this theory.

  6. Automatic source citations – There has been a lot of dialog on this blog about self-citing Internet sources. Record managers must support this functionality moving forward.

  7. Improved match analysis – I’m not talking about the underlying algorithms (although they are an area of constant improvement). I’m talking about the user interface. How can a novice reliably and consistently make decisions about possible matches in 30 seconds or less? Here are some rough concepts we’ve played with to try to figure this out. They still need lots of work but show how match analysis can be much more than the status quo screens in most record managers.







  8. Interesting Charts – Lines and boxes were cool when dot matrix printers were the rage. Rounded corners had their day as well. Give me a chart that I can put on my wall and my kids, relatives, friends, anyone who walks in the door will notice and want to look at and not mistake it for an electrical diagram. Add historical texture to the charts, themed backgrounds, etc.

  9. Source citation wizard – While I strongly advocate self-citing sources for online content, the reality is we will need to deal with manual source citations for a long time. Let’s build a simple wizard, similar to CitationMachine.net to make it drop-dead simple to cite a source.

  10. Personal Research Assistant/Research guidance – Legacy 6 is moving in the right direction, but its feature set works better for advanced genealogists than for ordinary people. Take a look at Grant Skousen’s Family Finder presentation from the BYU Family History Technology Workshop (should be posted in a week or so) to get a feel for how to deliver this for the uninitiated.
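
To make item 2 (Path to me) a little more concrete, here is a minimal Python sketch of how a record manager might compute the path from the user up to whichever ancestor is on screen. The `PARENTS` table, the person ids and the `path_to_me` function are all made up for illustration; a real record manager would read these links from its own database.

```python
from collections import deque

# Hypothetical in-memory pedigree: person id -> list of parent ids.
PARENTS = {
    "me": ["dad", "mom"],
    "dad": ["grandpa_a", "grandma_a"],
    "mom": ["grandpa_b", "grandma_b"],
    "grandma_b": ["great_grandpa_c"],
}


def path_to_me(ancestor, start="me", parents=PARENTS):
    """Return the chain of people from `start` up to `ancestor`, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        person = path[-1]
        if person == ancestor:
            return path
        # Walk one generation up and remember who we have already visited.
        for parent in parents.get(person, []):
            if parent not in seen:
                seen.add(parent)
                queue.append(path + [parent])
    return None


print(" > ".join(path_to_me("great_grandpa_c")))
```

The returned list could drive a simple breadcrumb in the UI so the user always sees how the current person connects back to them.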



I'd be happy to engage in a deeper conversation about how to take record managers to the next level. Add your comments or e-mail me (lawyerdc@ldschurch.org).

Friday, March 24, 2006

Give Me Context

Last summer I spent some time conducting user testing of a software prototype designed to help ordinary people find their ancestors. An overview of this prototype and the results of our testing was presented at the Family History Technology Workshop held at BYU a couple of weeks ago. It is definitely worth checking out the slides and abstract when they come online. One of the things that really stood out in the testing was the amount of context a user required to be able to do genealogical research. Here is a quote from one of the users, made while looking at a census record, which exemplifies the problem: "I need a map, a calculator and a really smart person at my side [to understand this census record]."

Here are a few pieces of context that users seem to always need in view in order to understand family history:

  • Maps (current and historical)

  • History (local, national, world)

  • Timeline

  • Pedigree


I've seen some interesting mashups trying to put one or more of these pieces of context together. For example, the following two sites have interesting mashups with Google Maps: www.linkr.org, www.linkr.org/temples. Of these important elements of context, history and historical maps are particularly hard to deliver dynamically in the context of family history. I believe this is due to the lack of an easily searchable and addressable collection of content. Wouldn't it be great if there were a public domain data set with a rich API for searching and distributing this type of content in context? Perhaps this is something that the guys at WeRelate.org can take on. Simply link the history and historical maps to their location authority, make the data elements addressable with sufficient granularity and provide an API. Oh yeah, and get a bunch of people to help populate the content.

Tuesday, March 21, 2006

Embedded Citation Examples

I have seen a few examples of embedded citations in the past few weeks. I would love to get a more complete list of good examples of sites that already have embedded citations. If you are aware of any, please leave a comment with some links to example sites or send me an e-mail at lawyerdc@ldschurch.org.

One site that comes to mind is Fact Monster. They have a simple implementation with a link at the bottom of each page that opens a pop-up with the citation detail. Follow this link and look at the bottom of the page for the button.

A co-worker (thanks Steve) also sent me information about a NISO standard called OpenURL 1.0 (Z39.88-2004) which has come out of the library information community as a means to embed citation metadata in webpages. More detail can be found at the OpenURL COinS website.
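
For a feel of what COinS looks like in practice, here is a small Python sketch that builds one. As I understand the convention, COinS puts OpenURL key/value pairs into the `title` attribute of an empty `span` with the class `Z3988`. The `coins_span` helper and the sample values are made up for illustration, and I've assumed the book metadata format; check the COinS documentation before relying on the exact keys.

```python
from html import escape
from urllib.parse import urlencode


def coins_span(title, author, date):
    """Build a COinS <span> carrying OpenURL key/value citation metadata."""
    fields = [
        ("ctx_ver", "Z39.88-2004"),                    # OpenURL 1.0 version tag
        ("rft_val_fmt", "info:ofi/fmt:kev:mtx:book"),  # assumed: book format
        ("rft.btitle", title),
        ("rft.au", author),
        ("rft.date", date),
    ]
    # URL-encode the pairs, then escape "&" so the value is safe in HTML.
    return '<span class="Z3988" title="{}"></span>'.format(escape(urlencode(fields)))


# Hypothetical sample values.
print(coins_span("A Family History", "Jane Doe", "1850"))
```

A tool that understands COinS can then pull the citation out of the page without the user typing anything.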

Saturday, March 11, 2006

Genealogical Embedded Citation Standard 0.1 (Strawman)

Thanks to Michael Nelson and Derek Maude for taking a first whack at what the structure for a genealogical embedded citation standard might be. The following structure is intended to be compatible with GEDCOM and the upcoming FamilySearch Family Tree. It could easily be implemented in XML or a microformat. There are some outstanding questions that Michael and Derek pose which follow this quick strawman proposal. Please review and share your thoughts through the comments link below or by e-mailing me (lawyerdc@ldschurch.org).

Looking for more information on Genealogical Embedded Citations? See the March 3rd article, Self-citing Internet Sources

Nested list of genealogical embedded citation elements

citation
     url
     film-number
     sheet-number
     page-number
     frame-number
     call-number
     book-number
     image-number
     record-number
     batch-number
     serial-number
     date-recorded
     certainty
     comment
     source
         url
         title
         author
         abbreviation
         publication-info
         description
         time-period
         locality
         language
         film-number
         call-number
         batch-number
         comment
         repository
             name
             address
             phone
             email
             url
             comment

Just in case your browser doesn't like the way I've chosen to indent, I've included a text description of the hierarchy at the bottom of the page.

Some questions to consider
1. Other formats have the ability to include the actual text. Is this necessary given the application?

2. Should there be a "provider" field for sources?

3. Should there be a "source-type" field for, say, stating the source is a census record?

4. Should a citation embedded in a page represent a citation for that page or for the original record?

5. We included a description field and a comment field in the source. Is that necessary?

6. Should there be an "agency" field to include what organization originally created the record?

Text Description of Hierarchy
'Citation' is level 1 in the hierarchy. It contains the following level 2 elements: url, film-number, sheet-number, page-number, frame-number, call-number, book-number, image-number, record-number, batch-number, serial-number, date-recorded, certainty, comment, source.

The level 2 element 'source' contains the following level 3 elements: url, title, author, abbreviation, publication-info, description, time-period, locality, language, film-number, call-number, batch-number, comment, repository.

The level 3 element 'repository' contains the following level 4 elements: name, address, phone, email, url, comment.
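
Since the strawman could be implemented in XML, here is a rough Python sketch of what one citation might look like when serialized that way. The element names come from the hierarchy above; the `build_citation` helper and every sample value are made up for illustration, and nothing here is meant as the final format.

```python
import xml.etree.ElementTree as ET


def build_citation(fields, source_fields, repository_fields):
    """Nest citation > source > repository, one child element per field."""
    citation = ET.Element("citation")
    for name, value in fields.items():
        ET.SubElement(citation, name).text = value
    source = ET.SubElement(citation, "source")
    for name, value in source_fields.items():
        ET.SubElement(source, name).text = value
    repository = ET.SubElement(source, "repository")
    for name, value in repository_fields.items():
        ET.SubElement(repository, name).text = value
    return citation


# Hypothetical sample values; only the element names come from the strawman.
citation = build_citation(
    {"url": "http://example.org/records/123", "page-number": "14", "certainty": "high"},
    {"title": "Example Parish Register", "locality": "Example County"},
    {"name": "Example Archive", "url": "http://example.org"},
)
print(ET.tostring(citation, encoding="unicode"))
```

Fields that don't apply to a given record are simply left out, which keeps the embedded block short.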

Tuesday, March 07, 2006

Engaging Ordinary People

Before diving further into technology and feature threads, I should lay a foundation for why we need to engage ordinary people in family history.

Why Take Genealogy to the Common Person?
Imagine a world where average people have a strong understanding of their heritage; the legacy and values of their ancestors; and interactions with others are in the context of their relationships – this guy sitting across the table from me is my 3rd cousin. Understanding our heritage and relationships brings more meaning and stability to life.

Interestingly, a high percentage of adults (estimates vary from 65-85%) have a desire to know more about their ancestors. There seems to be an innate desire to understand our heritage, yet most people who go down this path are quickly overwhelmed by the obstacles. The end result is that while the majority of adults in the world are interested in the domain of family history at some level, very few are able to contribute productively to the effort of mapping the family of man. How might the world be different if the efforts of all the people interested in their roots could be harnessed and channeled to contribute more meaningfully toward the overall challenge of mapping our common family tree?

Finding and Adding an Ancestor to a Pedigree is REALLY Hard


The reality of the world today is that for ordinary people finding and adding a name to their pedigree is very difficult. The challenges inherent to finding a name can be categorized into three areas: logistic and technical hurdles, varied life circumstances, and the reality that today’s tools don’t offer an engaging experience. These problem areas will be described in more detail in future posts. Solving these problems is a matter of answering the following questions:

  • How do we remove logistic, technology, knowledge, skill, & economic barriers?

  • How do we let ordinary people in a broad range of life circumstances contribute meaningfully?

  • How do we deliver an experience that is inviting, interesting, and fulfilling?


Engaging Ordinary People


The diagram above represents a conceptual model for how to harness the efforts of ordinary people. It involves offering a broad range of activities that are inter-related. Each activity is meaningful in and of itself while contributing to the overall effort of a common family tree. Activities toward the top of the list are likely to have broader participation than those at the bottom of the list. As long as an activity adds to the overall effort of building our common heritage, it belongs on the list. The Web 2.0 philosophies and related technologies play very nicely into such a model.

Please offer your feedback on the ‘Engagement Model’ by adding comments to this post. Do you think this model has the potential to harness the combined efforts of millions?

Friday, March 03, 2006

Self-citing Internet Sources

Nearly every genealogical tool or service provider has at one time or other thought of how much better the world would be if there were a way to automatically cite the source of genealogical data on the Internet. Then they quickly get discouraged at the idea as they think of the level of industry cooperation that would be required to make it happen.

There are some isolated cases of self-citing sources. For example, PAF Insight Manager (OhanaSoftware) is a very popular tool among LDS genealogists. The tool has the ability to automatically cite sources of information obtained from the FamilySearch website. I'm sure there are other similar implementations of which I'm not currently aware. At the present time there does not seem to be a de facto standard for how to do this.

I don't think this is rocket science. We don't need some Automatic Citation Encapsulation (ACE) protocol. We simply need to agree upon a way to embed citation details into a web page. The citation detail doesn't necessarily need to be viewable to the user. Any time an Internet-based source is cited, the tool or service can simply look at the citation details on the web page and appropriately cite the source for the user. In fact, the embedded citation block could be as simple as the following:

<Embedded Citation Block>
Super Cool Online Internet Source Which Links Me to Adam
Published on Some Day
Found on Some Page
Random Text
</Embedded Citation Block>


There are a lot of extremely smart people that will look at the above example and immediately see that additional tags could be added to impose some type of taxonomy on the citation. Even if the embedded citation block was never more sophisticated than what is proposed above, it would be substantially better than the ad hoc citations of the masses that don't have degrees in library science.
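
To show how little a tool would need to do, here is a minimal Python sketch that pulls the citation lines out of a page. It assumes, purely for illustration, that the block is marked up as a `div` with the class `embedded-citation`; that marker, the `CitationExtractor` class and the sample page are my own inventions, not an agreed standard.

```python
from html.parser import HTMLParser


class CitationExtractor(HTMLParser):
    """Collect text inside <div class="embedded-citation"> (hypothetical marker)."""

    def __init__(self):
        super().__init__()
        self.depth = 0   # number of open <div>s, counted from the citation block
        self.lines = []

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        if self.depth:
            self.depth += 1
        elif ("class", "embedded-citation") in attrs:
            self.depth = 1

    def handle_endtag(self, tag):
        if tag == "div" and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.lines.extend(s.strip() for s in data.splitlines() if s.strip())


def extract_citation(page):
    """Return the citation lines found in `page`, or an empty list."""
    parser = CitationExtractor()
    parser.feed(page)
    parser.close()
    return parser.lines


# Hypothetical page; the block is hidden from the reader but present in the markup.
PAGE = """<html><body>
<h1>Some record</h1>
<div class="embedded-citation" style="display:none">
Super Cool Online Internet Source Which Links Me to Adam
Published on Some Day
Found on Some Page
</div>
</body></html>"""

print(extract_citation(PAGE))
```

A record manager could run something like this whenever the user saves data from a web page and attach the result as the source citation.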

The efficiency of an embedded citation block may be questioned. After all, if an application is just trying to get the data in the citation block, why impose the overhead of serving up the rest of the web page? Wouldn't it be more efficient to have a service or simply a URL which only delivers the citation details without this overhead? There is some validity to this concern, and the two approaches (embedding a citation block in the web page, and offering a specific citation service or URL for obtaining just the citation details) aren't mutually exclusive. I prefer the embedded citation block approach. It seems easier to implement. Anyone who can create a web page can create an embedded citation block.

So here's the rally cry. Let's get together and define an embedded citation block standard. Let's keep it as simple as possible to start and see if we can't get a few of the more popular online content providers and tool vendors to implement it. Send me your comments. FamilySearch can easily implement this approach into the system we are building to deliver digitized microfilm. How would you like to see this implemented?