Wednesday, April 26, 2006

Family History Technology Workshop

This past March the 6th annual Family History Technology Workshop was held at Brigham Young University. I've been waiting for the presentations to be posted online so that I could reference them and offer some commentary on the topics discussed at the conference. The slides for each of the presentations referenced below can be found here.

Peter Norvig, Director of Research at Google
There was a lot of great content in Peter’s keynote address Thursday morning. What stuck out most in my mind was the philosophy of recall versus precision when searching a data set. The basic philosophy Peter presented seemed to be that when you are dealing with sufficiently large quantities of data, your recall (as a percentage of the total data set) can be low as long as your precision is extremely high.
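To make the recall versus precision trade-off concrete, here is a minimal sketch in Python. It is my own illustration, not something from Peter's slides; the function name and the record numbers are made up for the example. Precision is the share of returned results that are relevant, and recall is the share of all relevant records that the search returned.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one search result set."""
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = retrieved & relevant
    # Precision: fraction of what we returned that is actually relevant.
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    # Recall: fraction of everything relevant that we managed to return.
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical data: 1,000 relevant records exist, and the search
# returns only 10 results, 9 of which are relevant.
relevant_records = set(range(1000))
search_results = list(range(9)) + [5000]

precision, recall = precision_recall(search_results, relevant_records)
print(f"precision={precision:.3f} recall={recall:.3f}")
```

In this made-up case the search finds under one percent of the relevant records, yet nine out of ten results are good ones, which is the situation described above.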

Matthew Smith and Christophe Giraud-Carrier – Genealogical Implicit Affinity Network
Matthew presented their work on a Genealogical Implicit Affinity Network. This was totally cool. They took GEDCOM files and mined them for affinities between interesting data points. Some of the examples offered were relationships, naming patterns and occupations. The results were presented as hyperbolic tree-like affinity diagrams. I think there is something in this that, with some refinement, would help ordinary people have a greater interest in and appreciation for their ancestors, not to mention the practical value of the data in research. I was so fascinated with their presentation that a few weeks ago we visited them in their lab to have a follow-on conversation. I hope to do a separate blog posting specific to this. If you’d like to find out more about their work, visit the data mining lab website.

Shane Hathaway and the Touchstone Team – The Bit Mountain Research Project
Shane gave a great presentation on what it takes to build an 18 petabyte system that can be preserved long-term. 18 petabytes is the projected storage capacity that will be required over time for the new family history system the Church is developing. This is an area where the Church’s needs appear to be ahead of the industry’s capability. The paper submitted for this session (as with the other sessions) contains much more information than could be presented in the 20 minute time slot. It describes the use of forward error correction in a distributed file system to deliver a ‘self-healing’ data store for applications.
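For readers unfamiliar with forward error correction, here is a toy sketch in Python of the general idea. This is not the scheme from the Bit Mountain paper, which I have not reproduced here; it is the simplest possible stand-in, a single XOR parity block, and all names and sample data are my own. Extra redundant data is stored alongside the real data so that a lost block can be rebuilt from the survivors.

```python
def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    length = len(blocks[0])
    if any(len(block) != length for block in blocks):
        raise ValueError("all blocks must have the same length")
    out = bytearray(length)
    for block in blocks:
        for i, value in enumerate(block):
            out[i] ^= value
    return bytes(out)


def encode(data_blocks):
    """Return the data blocks followed by one parity block."""
    return list(data_blocks) + [xor_blocks(data_blocks)]


def recover(stored_blocks, missing_index):
    """Rebuild the single missing block from all surviving blocks."""
    survivors = [b for i, b in enumerate(stored_blocks) if i != missing_index]
    return xor_blocks(survivors)


# Hypothetical data: three blocks, each imagined to live on a different node.
data = [b"birth", b"death", b"marry"]
stored = encode(data)

# Simulate losing the node that held block 1, then heal from the rest.
lost = 1
damaged = list(stored)
damaged[lost] = None
rebuilt = recover(damaged, lost)
print(rebuilt)
```

A single parity block only survives one loss at a time. A production system at this scale would presumably use a stronger code that tolerates several simultaneous failures, but the self-healing principle is the same.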

Dallan Quass – Identifying Genealogical Content on the Web
Dallan is heading up a non-profit organization called the Foundation for On-Line Genealogy. He presented some research about their efforts to determine the best way to search for genealogical information on the web. Finding a more effective way to search the web for genealogical data is key to making genealogy more palatable for ordinary people. I’m excited by the work Dallan’s organization is doing. You can learn more about this and other efforts at WeRelate.org.

Grant Skousen – Family Finder Prototype
Grant presented an overview of a software prototype designed to help ordinary people (read: people who have never done genealogy) find their ancestors. The results of the research were promising. This project is one that I became involved with near the end. It has been extremely valuable in helping to shape thoughts around what it takes to help ordinary people do family history. I hope to do a future posting offering more insight into the research. For now, be sure to review this presentation.

Randy Wilson – High-Level View of a Source-Centric Genealogical Model
Randy presented a conceptual framework for a system that would change genealogy from an unbounded task to a finite effort. I believe Randy’s proposal is on target and that such a system must be created as the foundation of an effort to make genealogy work for the ordinary person. One of the primary points of frustration for those who contemplate finding their ancestors is not knowing what has already been done or where to start. Randy’s paper and slides are definitely worth reviewing.

5 comments:

Mark Butler said...

This is interesting stuff. On the other hand...

[Off topic]
Is the following "license" (see below) really necessary? Why is it that the larger the organization, the greater the level of superfluous (and generally unenforceable) legal paranoia? A shrink wrap license for a web site? Where no value exchange occurs at any point in the process?
What about fair use?

And more particularly why is the Church living in such irrational fear? Does BYU really think there is a commercial market for second hand PowerPoint presentations?

Normal people like to see their ideas propagated and do not freak out when someone prints a document in an environment tainted by evil dirty money grubbing (as if BYU employees were unpaid volunteers). It is sort of like the problems with the ivory tower concept of 'amateur' athletics presented so well in Chariots of Fire.

------

You may print material from this Web Site for personal or non-profit educational purposes only. All copies must include any copyright notice originally included with the material. All other uses requires the prior written permission of BYU CS Department.

Nothing contained in this Web Site should be construed as granting, by implication, estoppel, or otherwise, any license or right to use any Marks displayed on this Web Site without the express written permission of BYU CS Department or any third party that may own the Marks or content contained on this Web Site. Any unauthorized use of the Marks or any other material, except as authorized herein, is strictly prohibited.

For further information regarding the use of materials contained on this site, please contact the Computer Science Department at Brigham Young University, by telephone (801) 422-3027 or e-mail workshop@cs.byu.edu.

Disclaimer and Limitation of Liability

BYU CS DEPT. MAKES NO WARRANTY, EXPRESSED OR IMPLIED, INCLUDING THE WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE OF ANY MATERIAL DISPLAYED OR DISTRIBUTED THROUGH THIS Web Site, NOR REPRESENTS THAT ITS USE WOULD NOT INFRINGE PRIVATELY OWNED RIGHTS. WE DISCLAIM ALL WARRANTIES WITH REGARD TO THE INFORMATION PROVIDED. IN NO EVENT WILL WE BE LIABLE FOR ANY DAMAGES OR LOSSES WHATSOEVER RESULTING FROM OR CAUSED BY THE USE OR MISUSE OF THIS Web Site OR CONTENTS AVAILABLE THEREIN.

BYU CS DEPT. USES REASONABLE EFFORTS TO INCLUDE ACCURATE, COMPLETE AND CURRENT INFORMATION. WE DO NOT, HOWEVER, WARRANT THAT THE CONTENT HEREIN IS ACCURATE, COMPLETE, CURRENT, OR FREE OF TECHNICAL OR TYPOGRAPHICAL ERRORS. IT IS THE USER'S RESPONSIBILITY TO VERIFY ANY INFORMATION BEFORE RELYING ON IT.

ACCESS TO, AND USE OF, THIS Web Site AND THE CONTENT THEREOF IS AT THE RISK OF THE USER. IN CERTAIN INSTANCES WE HAVE PROVIDED LINKS TO OTHER WEB SITES SOLELY FOR YOUR CONVENIENCE. THESE SITES ARE NOT MAINTAINED OR CONTROLLED BY BYU CS DEPT., AND WE ARE NOT RESPONSIBLE FOR THEIR CONTENT. IT IS UP TO YOU TO TAKE PRECAUTIONS TO ENSURE THAT WHATEVER YOU SELECT FOR YOUR USE IS FREE OF SUCH ITEMS AS VIRUSES, BUGS, WORMS, CANCELBOTS, TROJAN HORSES AND OTHER ITEMS OF A HARMFUL OR DESTRUCTIVE NATURE.

Mark Butler said...

Or in other words isn't it a bit hypocritical to deny the fruits of tithing contributions to the people whose labors pay their salaries in the first place?

Dan Lawyer said...

Mark,

I appreciate your concerns about the license BYU attached, but this is probably not the best forum for venting them. Contact me via e-mail at lawyerdc@ldschurch.org and I'll help get you in touch with the people who may be able to change this.

Ben Crowder said...

FYI, the links for Randy Wilson's presentation aren't working right now.

Nice summaries, by the way -- thanks. :)

Dan Lawyer said...

Ben,
Thanks for the catch on the links to Randy's paper and slides. I've fixed them. You may need to refresh the page.