Wednesday, April 26, 2006

Family History Technology Workshop

This past March the 6th annual Family History Technology Workshop was held at Brigham Young University. I've been waiting for the presentations to be posted online so that I could reference them and offer commentary on some of the topics discussed at the conference. The slides for each of the presentations referenced below can be found here.

Peter Norvig, Director of Research at Google
There was a lot of great content in Peter’s keynote address Thursday morning. The thing that stuck out the most in my mind was the philosophy of recall versus precision when searching a data set. The basic philosophy Peter presented seemed to be that when you are dealing with sufficiently large quantities of data, your recall (as a percentage of the total data set) can be low while your precision stays extremely high.
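
To make the recall versus precision trade-off concrete, here is a small sketch in Python. It is my own illustration, not something from Peter's slides: the function name and the record counts are made up. Precision is the share of returned results that are relevant; recall is the share of all relevant records that were returned.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one search.

    precision = relevant results returned / all results returned
    recall    = relevant results returned / all relevant records
    """
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical numbers: 100,000 relevant records exist in the data set.
relevant_ids = range(100_000)

# The search returns 1,000 records: 990 relevant ones plus 10 misses
# (negative ids stand in for irrelevant records).
returned_ids = list(range(990)) + [-(i + 1) for i in range(10)]

precision, recall = precision_recall(returned_ids, relevant_ids)
print(f"precision={precision:.2%} recall={recall:.2%}")
```

With these invented numbers the searcher sees almost nothing but good results, even though the search surfaced only about 1% of what is out there.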

Matthew Smith and Christophe Giraud-Carrier – Genealogical Implicit Affinity Network
Matthew presented their work on a Genealogical Implicit Affinity Network. This was totally cool. They took GEDCOM files and mined them for affinities between interesting data points; some examples offered were relationships, naming patterns and occupations. The results were presented as hyperbolic tree-like affinity diagrams. I think there is something in this that, with some refinement, would help ordinary people have a greater interest in and appreciation for their ancestors, not to mention the practical value of the data in research. I was so fascinated with their presentation that a few weeks ago we visited them in their lab for a follow-up conversation. I hope to do a separate blog posting specific to this. If you’d like to find out more about their work, visit the data mining lab website.

Shane Hathaway and the Touchstone Team – The Bit Mountain Research Project
Shane gave a great presentation on what it takes to build an 18 petabyte system that can be preserved long-term. 18 petabytes is the projected storage capacity that will be required over time for the new family history system the Church is developing. This is an area where the Church’s needs appear to be ahead of the industry’s capability. The paper submitted for this session (as with the other sessions) contains much more information than could be presented in the 20-minute time slot. It describes the use of forward error correction in a distributed file system to deliver a ‘self-healing’ data store for applications.
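
For readers who have not run into forward error correction, here is a toy sketch of the general idea in Python. It is not the scheme from the Bit Mountain paper, which I am not reproducing here. It uses plain XOR parity, about the simplest erasure code there is, and all of the names and sample data are hypothetical. One extra parity block is stored alongside the data blocks, and any single lost block can be rebuilt from the survivors.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def add_parity(data_blocks):
    """Return the data blocks followed by one XOR parity block."""
    return list(data_blocks) + [xor_blocks(data_blocks)]


def recover(stored_blocks, missing_index):
    """Rebuild the block at missing_index from all the other blocks."""
    survivors = [b for i, b in enumerate(stored_blocks) if i != missing_index]
    return xor_blocks(survivors)


# Hypothetical data: three equal-length blocks, each imagined on its own disk.
data = [b"ANNA", b"JOHN", b"MARY"]
stored = add_parity(data)

# Pretend the disk holding block 1 failed, then "heal" it from the rest.
rebuilt = recover(stored, missing_index=1)
print(rebuilt)
```

A real system at this scale would need a code that tolerates more than one simultaneous loss; this sketch survives exactly one.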

Dallan Quass – Identifying Genealogical Content on the Web
Dallan is heading up a non-profit organization called the Foundation for On-Line Genealogy. He presented some research about their efforts to determine the best way to search for genealogical information on the web. Finding a more effective way to search the web for genealogical data is key to making genealogy more palatable for ordinary people. I’m excited by the work Dallan’s organization is doing. You can learn more about this and other efforts at WeRelate.org.

Grant Skousen – Family Finder Prototype
Grant presented an overview of a software prototype designed to help ordinary people (read: people who have never done genealogy) find their ancestors. The results of the research were promising. This project is one that I became involved with near the end. It has been extremely valuable in helping to shape thoughts around what it takes to help ordinary people do family history. I hope to do a future posting offering more insight into the research. For now, be sure to review this presentation.

Randy Wilson – High-Level View of a Source-Centric Genealogical Model
Randy presented a conceptual framework for a system that would change genealogy from an unbounded task to a finite effort. I believe Randy’s proposal is on target and that such a system must be created as the foundation of an effort to make genealogy work for the ordinary person. One of the primary points of frustration for those who contemplate finding their ancestors is not knowing what has already been done or where to start. Randy’s paper and slides are definitely worth reviewing.

3 comments:

Dan Lawyer said...

Mark,

I appreciate your concerns about the license BYU attached, but this is probably not the best forum for venting them. Contact me via e-mail at lawyerdc@ldschurch.org and I'll help get you in touch with the people who may be able to change this.

Anonymous said...

FYI, the links for Randy Wilson's presentation aren't working right now.

Nice summaries, by the way -- thanks. :)

Dan Lawyer said...

Ben,
Thanks for the catch on the links to Randy's paper and slides. I've fixed them. You may need to refresh the page.