Friday, February 22, 2008

Source Centric Prototype

I've been spending a lot of time thinking about concepts of source-centric genealogy. I've written in the past about how most family history applications tend to put the emphasis on the conclusions drawn and not the sources found. Of course they allow you to add a source citation and perhaps even include an image, but the heart of the experience (screen real estate, prominence, features, etc.) is about the conclusions. The paradox is that the heart of the matter really is the evidence.

We've built a prototype to try to make evidence more central to the experience. It has been interesting to see the reactions of different customer segments to the prototype. The professional genealogists love having the evidence right there. Those who have never done genealogy don't really care about the subtleties of evidence but are thrilled to see images of original documents about their ancestors.

One of the really interesting things in the reactions to the prototype was how people felt about different sources of information. You see, the prototype works against the new FamilySearch. The new FamilySearch contains basically all of the lineage-linked data sets in the possession of FamilySearch. This includes Ancestral File, Pedigree Resource File, International Genealogical Index, and many other lesser-known data sets. As we built the new FamilySearch we combined all of these data sets into one system. Where we had extremely high confidence that people were the same, we combined the records together into one person (OK, actually the computers did this for us). We did this in a non-destructive way. You could think of it like putting all of the people that were the same into one folder. You can pull out the individual people or you can look at the whole folder as one person.
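For the programmers reading, the folder idea can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the names `Persona`, `Folder`, `pull_out`, and `combined_view`, the record IDs, and the sample person are all made up here and are not the actual new FamilySearch data model.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Persona:
    """One person record exactly as it came from a data set (hypothetical shape)."""
    record_id: str
    data_set: str
    name: str
    birth_year: Optional[int] = None


@dataclass
class Folder:
    """Records judged to be the same person. Nothing is merged or overwritten."""
    personas: List[Persona] = field(default_factory=list)

    def add(self, persona: Persona) -> None:
        self.personas.append(persona)

    def pull_out(self, record_id: str) -> Persona:
        """Take one record back out of the folder, unchanged."""
        for i, persona in enumerate(self.personas):
            if persona.record_id == record_id:
                return self.personas.pop(i)
        raise KeyError(record_id)

    def combined_view(self) -> Dict[str, list]:
        """Look at the whole folder as one person, keeping every distinct value."""
        years = {p.birth_year for p in self.personas if p.birth_year is not None}
        return {
            "names": sorted({p.name for p in self.personas}),
            "birth_years": sorted(years),
            "data_sets": [p.data_set for p in self.personas],
        }


# Invented sample data: two records that were combined into one folder.
folder = Folder()
folder.add(Persona("AF-1", "Ancestral File", "Jane Example", 1820))
folder.add(Persona("IGI-7", "International Genealogical Index", "Jane Example", 1821))

view = folder.combined_view()        # the folder seen as one person
removed = folder.pull_out("IGI-7")   # the original record comes back intact
```

Because the folder only groups the original records, the conflicting birth years both survive in the combined view, and pulling a record out hands back exactly what went in.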

Anyway, the reason for all of that background is that the prototype basically starts with the new FamilySearch and displays the information as conclusions with sources rather than people combined together. When we showed this to people (especially those more experienced with genealogy) they really bristled at the trees but loved the sources. Here are some screenshots.

The nice thing about the way this is working out is that even though this displays things separated out again, the new FamilySearch effort creates a mapping for us between people in the system and all of the other information in the system. That means when you start out, you already have a tree pieced together and you can see the sources and trees that were used to do it.

This is kind of a ramble but if it sparks any thoughts I'd love to hear them.


Venitar said...

You are on the right track! The "sources" in new FS now are all but useless if not an interference. In working with others, I find that the biggest problem with sources is that the majority of the people who will be working with new FS don't understand why sources are necessary and think that something like 'Aunt Jesse said so' is as good a source as any. So there is a lot of education that needs to be done among church members.

Here's another challenge for you: sources, once created, whether they be a digital document or in a template, should ideally be available for linking to more than one individual and/or event in a person's database.

Keep at it! You have my moral support, for what it's worth.


Daniel Longmore said...

That is really great stuff. Any ideas when it will be available?

Dan Lawyer said...

Unfortunately I can't comment on when something like this might be available. The current effort is really just a prototype to help us understand the dynamics of the concept.

We do intend to make these sources easy to link to multiple people in the system. One of the great things about this approach is that if a distant cousin adds a source you'll be able to see it.

Venitar said...

Brilliant! Keep up the good work.


damarisfish said...

Dan, all the records FS has do not include the 1880/1 censuses, though, right? When I have seen the census records in a combining people exercise for my family in nFS, it is because it was a user-submitted event associated with a person's record? When we submit a GEDCOM, whatever has been added (in PAF) as an Optional New Event or Attribute shows up in the individual's record in PRF -> nFS.
I sure have appreciated the Source List in PAF - being able to enter a source one time, then select it, get exactly the same wording every time, and finish the citation for each particular use. Thanks for Memorize Citation, too! Maybe if we had the whole Family History Library Catalog available as the nFS Source List..., ha, ha? Thanks to Vern Taylor, Stockton CA FHC, for a nifty GEDCOM of 101 basic sources ( The real trick, as I see it, is to get a standardized listing for sources so that they can be searched & sorted, to find the matches. The website has a great idea: basically that if you can identify your ancestors in the 1880/1 censuses (they suggest using for free :-), and can pinpoint the Film Roll, page #, etc (see the citation), then you post your guys' names & that reference. Then when someone else comes along with the same family and same references, it makes sense to let you send a contact post to each other. If you have the same ancestors, and you didn't know about each other, then you are Lost Cousins now found! Fun, fun!
Here is something that has been an aggravation to me with sources. On when we download a record from the IGI, it essentially creates a new source record in PAF when it's imported. They all look like the same source (IGI, etc.), but they will not combine with each other, because today's date (of download) is part of the source in such a way that they all look like different bodies of work. I made a "dummy" basic IGI source to use instead, but it is a nuisance. Not complaining really, but, maybe when you guys have nothing else to do....
I am looking forward to 1) officially getting access to nFS, and 2) someday being able to download info from nFS, not just copying & pasting.
Thanks for the heads-up about your blog on the FHCNET Yahoo Group.
Damaris Fish, Central Point, Oregon

Blake Christensen said...

Now, if you can figure a way to represent a source where you can't figure out which of three John Goldens it belongs to, I would be very happy. I like what is going on in new family search and your source centric prototype, but they still don't support the early research phase of genealogy.

Shoebox Genealogy said...

Dan, I'm absolutely loving what you're doing and the direction you're taking with these products. I'm a pretty outspoken (constructive) critic of nFS, and see lots of things that need to be changed, fixed, or updated before nFS should be taken to the masses. I still think it's being released too early, but I'm starting to lose that bad taste in my mouth, knowing there are people like you pushing for the real saving grace of nFS: Sources.

I am especially encouraged by the way the Life Browser displays the term "evidence" beside each event, encouraging people who give information to cite their sources.

Repeating some of what has been mentioned by others, I think that there are several important considerations when developing the ability to add sources:

1- Sources are not always positive evidence. There should be several classifications for evidence - proved sources, disproved sources, ambiguous sources, and conflicting sources.

2- Adding the ability to comment on individual sources (like we comment on individual blog entries) would allow focused discussion about interpreting handwriting, the validity of the information, implications of the information, and how to handle conflicting information.

3- Eliminating the duplicate IGI sources in nFS. Printing out an FGS for a family often results in 2 pages of family info, and 5 pages of IGI sources. It's impossible to use! Putting duplicate IGI entry sources on a back-page, or combining them all and making the source "International Genealogical Index" a link to the full list of sources would be just fine. Honestly, nobody is using all those IGI sources.

4- Extracted records need to be distinguished from user-submitted IGI records. They are infinitely more important, and should be categorized as such, with corresponding batch and film numbers displayed.

5- A single source should be able to refer to multiple people, as described by others.

6- Sources, when applied to combined individuals, need to be able to stay with only one person when people are uncombined. Take a look at Heber C Kimball, who is currently merged with about 20 different men. Adding a source for one will not apply to the others, and when un-combined, there needs to be a way to determine who the sources belong to that involves the un-combiner AND the source submitter, letting one suggestion take precedence if the other individual refuses to reply.

Lots of other suggestions, but I think that will do for today. Keep up the good work, and please implement improved source features quickly!