Friday, 30 January 2015

A brief update on my genealogy log tool

I have explored several different options for user authentication and authorisation in Rails, ranging from rolling my own user management system (easy enough, but potentially error-prone) to using a third party Gem like Devise or Sorcery. I'm still not 100% sure which way to go, but I think I might opt for Sorcery because that will make integrating with OAuth providers (to allow users to log into my app using their Twitter or Google credentials) easier than rolling my own system. Sorcery it is then. /nod

The next decision is an easy one - I will be using Heroku to deploy the app. That way I can get the app up and running for free before deciding if it is worth paying for a larger deployment. I've been testing Heroku with a few sample apps and it seems easy enough to get things up and running. A number of large web apps also use Heroku, so it should handle the small amount of traffic I am likely to generate easily enough. /nod

Now I just have to decide on an initial feature set and start building the danged thing.

Saturday, 24 January 2015

No sources? No problem.

Today I spent a few hours visiting with my mother and aunt and the subject of family history came up. I haven't explored my mother's side of the family yet and had never heard many of their family stories, so it was quite an interesting visit. As I was leaving I mentioned to my aunt that I would be interested in seeing what information she has on her side of the family. According to the stories she told, several people on my maternal side have done varying amounts of research and there have been some conflicting details - one grand-aunt claims we have "Chinese royalty", another aunt claims we have no Chinese, a cousin says there is a Chinese connection... Whatever the truth, it will be fun exploring the different branches and trying to determine which of the conflicting tales is true.

On my way home I wondered how much of this research will be documented and how thoroughly at that. Will my aunt have all the relevant documents, or just a couple of family trees compiled by others? To be honest, it really doesn't matter that much. Sure, it would be nice if the trees are fully sourced, with reasoned arguments for the various conclusions, but even if there is just a tree with no supporting evidence that in itself could be a useful tool. I will be able to use the trees as a guide for my own research, which will be a heck of a lot more than I have right now.

Sure, I won't actually be doing any of this research for some time to come - I have a lot on my plate as it is - but when the time comes I will gladly accept whatever information my aunt can provide, no matter how poorly documented. Even a roughly sketched map is better than no map at all when exploring unfamiliar territory. And who knows? I might be pleasantly surprised to discover a treasure trove of fully sourced evidence which will make things so much easier!

Friday, 23 January 2015

First steps to building a simple genealogy log tool

It has been too hot the past few days to do any actual coding (my home office is not air conditioned 8^( ) so I have been putting some thought into just what part of my genealogical toolset to tackle first. The easy decision was to re-visit my genealogy log tool.

A genealogy log tool is a relatively simple affair - simply a database tracking research efforts. Nothing particularly fancy, but I need to make sure I am recording enough relevant information about research sessions and I need to make sure the database is searchable so I can readily find results from previous sessions. I have a simple tool already that is little more than a spreadsheet, but I know I can do better.
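
As a rough illustration of "a database tracking research efforts", here is a minimal sketch of a searchable research log using Python's built-in sqlite3 module. The table name, the column names and the helper functions are all assumptions made up for this example - they are not the design of the actual tool.

```python
import sqlite3

# Hypothetical schema: the table and column names are illustrative only.
SCHEMA = """
CREATE TABLE IF NOT EXISTS research_log (
    id           INTEGER PRIMARY KEY,
    session_date TEXT NOT NULL,
    repository   TEXT NOT NULL,
    subject      TEXT NOT NULL,
    search_terms TEXT,
    result       TEXT
)
"""


def open_log(path=":memory:"):
    """Open (or create) a research log database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def add_entry(conn, session_date, repository, subject, search_terms, result):
    """Record one research session; session_date is an ISO date (YYYY-MM-DD)."""
    conn.execute(
        "INSERT INTO research_log "
        "(session_date, repository, subject, search_terms, result) "
        "VALUES (?, ?, ?, ?, ?)",
        (session_date, repository, subject, search_terms, result),
    )
    conn.commit()


def search(conn, text):
    """Return earlier sessions mentioning `text`, oldest first."""
    pattern = f"%{text}%"
    rows = conn.execute(
        "SELECT session_date, repository, subject, result FROM research_log "
        "WHERE subject LIKE ? OR search_terms LIKE ? OR result LIKE ? "
        "ORDER BY session_date",
        (pattern, pattern, pattern),
    )
    return rows.fetchall()
```

Recording negative results ("nothing found") in the result column is the part a plain spreadsheet tends to miss, and it is what stops the same fruitless search being repeated later.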

Having decided on a research log, the next big decision is what platform do I target? I could make it a tablet app - a log would seem to be ideally suited to tablet devices, as they are easy to carry around to libraries and archives, so that is one option. Another option is to do a web-based app which would be accessible on any device anywhere you have internet connectivity.

If I choose the tablet route, I can do an iPad app relatively quickly, but if I want to also target Android that will require a little more effort. I don't have a great deal of experience with Android, so this might be a nice little project to brush up on my skills and through a couple of recent courses I have a few incentives to write Android apps - a programming contest I could enter and a free gift card offer from Amazon. That could be fun, but would mean I'd be writing two genealogy log apps - one for iOS and one for Android. Do I really want to double my workload?

The web-app route then seems to be the way to go - I'd only have one codebase and the app would work on any device, including iOS, Android and desktops. This would be an ideal opportunity to dust off my Ruby on Rails training at last.

Now that I have decided on a platform (Ruby on Rails) I just need to brush up my knowledge and look into some user authentication/authorisation tools to allow multiple users. So I have been spending the past two days reading up on Ruby Gems and getting totally overwhelmed with all the options available. I think my decision will be between Devise and Sorcery. Both gems can work with OAuth so I can let users log in with Facebook or Twitter accounts, although that would mean I'd have to get a Facebook account so I can get an API token to use OAuth.

Once again Ancestry steps up to the plate...

Ancestry.com.au is throwing open their Australian records for the Australia Day long weekend. If you have Aussie ancestors, this weekend is the perfect time to get some free research in!

Thursday, 22 January 2015

First experience ordering UK wills online

Even though I had told myself "no research until my software is ready" I had to break this rule the other day. My father is interested in our family history, but he does not know (nor care) how to do any research, so he is content to just read through the material put together by myself and my cousin. (My cousin put together a book tracing the BANNISTER line back to around 1732 - unfortunately with few sources.) Every now and then dad will pester me to show him some of the documents I have and occasionally he will ask for something I do not yet have. This happened a few days ago.

My great-great-grandfather, George Amos BANNISTER, came to Australia in 1856. We still do not know why he emigrated, but he came alone and settled in Melbourne, Victoria, Australia and within a few years was married and started a family. He had two children, Anne Rose (who I can only find two brief mentions of in other documents) and Amos Parker, my great-grandfather. When I was growing up, I heard stories about my great-grandfather being entitled to a guinea a week. The story was that he was an annuitant, receiving money from either the family business back in Stretford or a deceased rich relative. For some unknown and unexplained reason he refused to collect this money and that was a constant cause for family speculation - could we be rich?

When my cousin started researching our family tree, he turned up a number of unexpected details that caused a lot of consternation amongst the family, especially when his findings contradicted the "facts" of our family history. One detail my cousin seemed to dig up was that it was George Amos who was the annuitant, not Amos Parker, his son. Then I stuck my nose in and pointed out that George Amos's profession was listed as "accountant" on his marriage certificate, not "annuitant". Given the poor copy of the document we had and the family stories, my cousin had mis-read the profession - an honest mistake.

Later in my research however, I dug up some details that may lend some credence to the annuitant theory. I found a series of advertisements in the London Gazette, naming a large number of my relatives, including my gg-grandfather, his brothers and sisters and all their children in a case before the courts. These legal advertisements appear to be related to the will of a relative, Stephen RAINGILL. I found a copy of Stephen RAINGILL's will and he seems to have left each person named an annuity of £50 a year - close enough to a guinea a week to raise a few eyebrows! I still need to try and obtain any records of the court case to be sure what this was about, but it may yet prove that the family tale has some roots in truth after all.
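
To put a number on "close enough", here is a quick worked comparison in Python. It relies only on the standard pre-decimal values (20 shillings to the pound, 21 shillings to the guinea) and assumes a 52-week year; the function names are just for illustration.

```python
from fractions import Fraction

SHILLINGS_PER_POUND = 20
SHILLINGS_PER_GUINEA = 21
WEEKS_PER_YEAR = 52  # assumption: a year is treated as exactly 52 weeks


def guinea_a_week_per_year():
    """Yearly value of a guinea a week, as (pounds, shillings)."""
    total_shillings = SHILLINGS_PER_GUINEA * WEEKS_PER_YEAR
    return divmod(total_shillings, SHILLINGS_PER_POUND)


def annuity_per_week_in_shillings(pounds_per_year):
    """Weekly value of a yearly annuity, in shillings, as an exact fraction."""
    return Fraction(pounds_per_year * SHILLINGS_PER_POUND, WEEKS_PER_YEAR)


print(guinea_a_week_per_year())                   # (54, 12)
print(float(annuity_per_week_in_shillings(50)))   # a little over 19 shillings
```

So a guinea a week comes to £54 12s a year, while £50 a year works out at a little over 19 shillings a week - just under two shillings a week short of a guinea.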

I showed my father a copy of Stephen RAINGILL's will (and my transcription, because it is a bit hard to read) and that seems to have piqued his interest in family history and wills again. A few days ago dad asked if I had a copy of my ggg-grandfather's will - George Amos's father, Amos BANNISTER. I didn't have it because to the best of my knowledge UK wills have not been easily accessible online. However a few weeks ago the UK government launched their Find a Will service where you can order a digital copy of wills from 1858 to 1996 online for only £10.

My initial impression of this site was not favourable. To search for a will, you are asked for two pieces of information, surname and year of death. You cannot browse through entries, you cannot use wildcards, you must enter the surname exactly and the year of death. I already had the date of death for Amos BANNISTER (1861), so I entered BANNISTER and 1861 into the relevant fields and was presented with two pages from the Index of Wills and Administrations book for that year. Neither page had my ggg-grandfather. There is an option to browse forward or backward by year, so I scanned forward and back a few years but could not find Amos BANNISTER from Stretford at all. I did find his wife, Matilda who died in 1879. Disappointed but not deterred, dad agreed that Matilda's will could still prove useful, so I filled in the form to order her will. (Instead of a simple button to order the will directly, you must instead fill in date of probate, date of death, surname, first name, registry and optionally a folio number - most of the details are shown in a scanned image in the search results.)

A few days later I was in our local library and thought I would do a quick search on Ancestry to see if I could find any reference to Amos BANNISTER's will or probate there. Bingo! Ancestry returned the relevant page from the Index of Wills and Administrations - the very same index that the Find a Will service searches! For some bizarre reason, the Find a Will search does not return all matches. I copied the details from Ancestry and entered them into the order form on Find a Will and have now ordered my ggg-grandfather's will.

I promptly fired off a long email to the feedback address for Find a Will outlining my frustration at the restrictive and incomplete search functionality. To my surprise I received a reply within a few hours thanking me for my feedback and explaining that they are working on improving the search and offering help should I have trouble locating a will in the future. While the experience of finding and ordering a will was far from pleasant, the quick response to my complaint/feedback was reassuring. Hopefully the functionality is going to be improved, but until such time I will be searching for details on Ancestry and using those details to order any wills.

Now I just have to sit tight and wait for the wills to arrive.

Tuesday, 20 January 2015

Why write my own software and what am I writing?

A while back I started developing a tool to help with my genealogical research. This single tool gradually evolved into a small collection of inter-related tools, each focussing on a specific task. (I firmly believe that smaller, more focussed tools are better than one monolithic, "Swiss-Army Knife" type tool that tries to do everything.) As I have been re-educating myself in the ways of better genealogical practices, I have realised that some of my initial assumptions in the design of these tools were wrong, specifically when it came to sources and proof - I was not properly capturing source citation data, nor was I being particularly diligent when it came to proving my conclusions/assumptions. Now that I have a better understanding of things such as source citations and genealogy standards (especially the Genealogical Proof Standard) I figure I should re-engineer these tools to incorporate my new-found knowledge.

There are many reasons behind my decision to write my own software. I have been unhappy with the state of the software available for the Mac - there are some good packages, but each one has certain issues that bug me. The best that I have tried so far (MacFamilyTree) is reasonably good, but its handling of locations and sources leaves a bit to be desired IMHO. Another Mac program has great location management features, but surprisingly its mapping tools are not as advanced as MFT's. If I could cherry-pick pieces from several different programs I might be able to put together a truly awesome program, but there's still a major underlying problem for the way I want to be able to work.

A lot of the genealogy programs I have seen have been more focussed on creating a family tree and pretty charts and pay less attention to rigorous analysis and proof of the data entered by the user. There is also a lack of cutting-edge research tools that could assist a user to analyse their sources and find connections in their data.

I don't want to just record a set of dates for people in my family tree - I want to be able to find connections between people, properly study and analyse my source documents, capture detailed source citation information, apply the GPS, and more. So I figured the only way forward was to build my own tools to do as much of this as possible. To that end I have tools for genealogy logs, source transcription and analysis, and location management. To these I will be adding something geared more towards managing the GPS process, a reporting tool and of course a GEDCOM generator so I can export the resulting data for use in "traditional" genealogy software. I aim to make each tool small and focussed on the task at hand so that it is easy and fast to use.

I already have some prototypes of these tools, but I am now considering which platforms the tools should be targeted to. Some of the tools (the research log for example) lend themselves to tablet devices and smartphones, while others are more suited to desktop operation. Do I make the tools native apps or would a web-based solution be more appropriate? There are pros and cons for each platform choice, so I have a bit to think about.

I'd love to hear your feedback on the subject of genealogy research tools, as opposed to family tree builders. I'm interested in the process of finding the data to put in our trees, rather than the trees themselves. ;^)

Saturday, 17 January 2015

Not really doing a genealogy do-over

Okay, so I guess I am not really participating in the Genealogy Do-Over (GDO) as such. Yes, I have put aside all my previous research and I will be rebuilding my tree from (almost) scratch, but I started my do-over late last year and do I really want to start a do-over of my do-over? Prolly not. Instead, I will be using the GDO posts as a guide for my own thang and I will continue to plod along in my own way.

One of the realisations I made when I decided to start again was that I really didn't know as much as I thought I knew. I knew I didn't know very much and I felt that some of the things I was doing weren't quite right - I just didn't know how much I didn't know. So I set about educating myself on, if not the right way, a better way of doing my genealogy research. I have been unhappy with various aspects of the different genealogy programs I have used, especially when some things that just felt natural to me were so awkward to do in the software. After reading a few good books and trawling various blogs and other online resources I now feel more comfortable about my level of knowledge.

Am I ready to dive in and start researching again? Not quite. I am still uncomfortable with the software situation. (Here is where I went off on a long rant about genealogical software, which will become its own blog post...) Unfortunately I am somewhat of a procrastinator[*] - I know I really should get cracking and start writing my software, but I keep putting it off or getting distracted by shiny things elsewhere.

Anyway, I will be following the GDO material that various bloggers are posting, but I'll be doing my own do-over in my own plodding way. 8^)

[*] I am so much of a procrastinator that I am thinking of starting a procrastination blog or support group - I'll get started on that tomorrow... ;^)

Thursday, 15 January 2015

The reason why I stopped using FamilySearch's Family Tree

This morning I received an email notification from FamilySearch alerting me to the fact that someone had modified a person in my family tree. I had all but forgotten about this tree on FamilySearch as I had stopped using that functionality a long time ago and this email reminded me of why I stopped using it.

Before I get too far into this, I will admit to not having used the Family Tree properly when I first started using it, but after a short while I realised my mistakes and started adding people and sources and built up a small tree of my ancestors. It was a nice tool and the fact that FamilySearch was integrated into my genealogy tool (MacFamilyTree) made it quite attractive to use. However the functionality to merge people has some major issues IMHO. It is not always clear when or how to merge people and if a decision to merge is taken, there is a lot of collateral work that also needs to be done to ensure that the merged entity retains its integrity.

The changes I was notified of this morning highlighted the integrity problems to me. I now have in my FamilySearch Family Tree, an ancestor (let's call her EB) who has two sets of identically named parents and two sets of identically named siblings. The user who modified EB simply added the parents from their tree to EB without first checking EB's record. A quick check would have shown that EB already had parents attached, and via the parents a complete list of siblings - 11 in total.

Now when I look at my FS Family Tree I see EB with two sets of parents and 22 siblings! I have neither the time, nor the inclination to go through each of the pairs of parents and siblings to do a proper merge of the family, but it does need to be done. As such, this tree is broken, not just for me, but for anyone who views it. I can flag one set of parents as "preferred", but I doubt that will really do anything useful.

One big problem with FS Family Trees is that anyone can edit any person with little or no control over what edits are made. Randomly adding a person to a tree by creating a parent-child or spousal relationship without checking to see if any other details need to be updated or associated people need to be merged is just creating a mess. IMHO the only way to add a new person into your tree is to carefully and methodically go through each connected person and check to see if they need to be merged with another record. This is a slow and painstaking task, but if everyone did this, the resultant trees would be much more useful to all involved. Without this careful merging, what you end up with is two trees joined by a single person, with a whole mess of duplicates on either side.

So for now I am just going to turn off the FS notifications and pretend that this functionality does not exist. I will continue to use FamilySearch for research but will not be linking people in my desktop tree to FamilySearch when my rebuild finally gets underway.

Saturday, 10 January 2015

My file naming scheme

Before I (finally) dive back into my research, I'll take a brief moment to outline the file naming scheme I have chosen for my digital files. In previous posts I have outlined most of the folder structure I am using but the real magic in my system is in the file names I use and careful use of symbolic links, aka file aliases. This will be a long post, but I hope you will stick with it.

I might give a more detailed recap of the overall folder hierarchy in a future post, however a brief summary is in order for now. At the top level is a single folder for the family I am researching. I give this folder a descriptive name, which matches the name I give to the database in my genealogical software. In my case this is Bannisters of Stretford. Under this folder are a series of sub-folders for Books, People, Pictures, Places, and Sources.

Books contains, well, books of course, but only books that are not used as sources in my research. This is more a place for helpful reference books, histories of the town my family came from, auto-generated reports etc.

People is a special folder that I will come back to shortly.

Pictures contains photos, mostly of people in the family, but no images of source documents - source images belong elsewhere.

Places is another special folder like People and will be explained below.

Sources contains all my source documents, broken down by type of source such as BMD, Books, Census, Electoral Rolls, etc. All images of sources are appropriately named using the scheme I will define below and then placed in one of the subfolders under this Sources folder. The key is that every source has one and only one location for the original document which can be (hopefully) easily discerned by the type of document it is.

Now, onto the actual naming scheme...

For sources relating to a specific person, the basic file naming scheme is as follows:
<Date> <Person Name> - <Type of document>, <Page number>.<file type>
  • <Date> is the date of the source document (or a date range if applicable) in the format YYYY, YYYY-MM or YYYY-MM-DD as applicable
  • <Person Name> should be self explanatory, but to be clear this should be the name of the person (or persons) as given in the document - not their nickname or a name they may be known by at a future date
  • <Type of document> will be something like "Birth Certificate", "Marriage Certificate", etc
  • <Page number> is optional, but should indicate the page number or range of page numbers if this image is just part of a larger source.
  • <File type> is just the standard file extension, like tiff, png, jpg, pdf, etc.
For other types of sources, such as census pages, electoral rolls, parish registers or newspapers:
<Date> <Source name>, <Page number>.<file type>
  • <Date> is the date of the source document (or a date range if applicable) in the format YYYY, YYYY-MM or YYYY-MM-DD as applicable
  • <Source Name> is a descriptive name for this source
  • <Page number> is optional, but should indicate the page number or range of page numbers if this image is just part of a larger source.
  • <File type> is just the standard file extension, like tiff, png, jpg, pdf, etc.
It is vitally important that the first part of the filename be a date in the format YYYY, YYYY-MM or YYYY-MM-DD as applicable. If the year is not known, then _unknown_ is used instead. Using YYYY-MM-DD for dates means the filesystem will automatically sort your documents by date, with unknown dates filtering to the top of the list.
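
One caveat worth sketching: whether _unknown_ lands at the top depends on the sorting rules of whatever is listing the files. In a plain character-code sort (which is what Python's sorted() does) the underscore comes after the digits, so undated files would sink to the bottom. The sort key below, which is only an illustrative helper and not part of the scheme itself, forces undated files to the top regardless.

```python
def sort_key(filename):
    """Sort key that puts undated files first, then dated files in date order.

    Assumption: undated files start with the literal text "_unknown_".
    """
    if filename.startswith("_unknown_"):
        return (0, filename)
    return (1, filename)


def sort_sources(filenames):
    """Return the filenames with unknown dates first, then chronologically."""
    return sorted(filenames, key=sort_key)
```

Because the dates are written most-significant part first, ordinary text comparison is enough to put a year-only name such as "1956 ..." ahead of "1956-04-13 ...", with no date parsing needed.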

Some sample document names would be:
1968-11-23 Amos Ross BANNISTER - Birth Certificate.png
1956-04-13 Kennett John BANNISTER and Colleen Dawn WALTERS - Marriage Certificate.png
2014-03-05 Narooma News, p34.tiff
Admittedly some of the filenames can get a little long using this scheme, so you may want to use abbreviations, but I like the verbosity - it makes it quite clear what each document contains.
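
For anyone who would rather generate these names than type them, the two patterns above could be sketched in Python along these lines. The function names are made up for this example, and as a simplifying assumption only single dates (or _unknown_) are accepted - date ranges are left out because the post does not pin down how a range is written.

```python
import re

# YYYY, YYYY-MM or YYYY-MM-DD
DATE_PATTERN = re.compile(r"^\d{4}(-\d{2}(-\d{2})?)?$")


def check_date(date):
    """Accept YYYY, YYYY-MM, YYYY-MM-DD or the placeholder _unknown_."""
    if date != "_unknown_" and not DATE_PATTERN.match(date):
        raise ValueError(f"unexpected date format: {date!r}")
    return date


def person_source_name(date, person, doc_type, extension, pages=None):
    """<Date> <Person Name> - <Type of document>, <Page number>.<file type>"""
    name = f"{check_date(date)} {person} - {doc_type}"
    if pages:
        name += f", {pages}"
    return f"{name}.{extension}"


def general_source_name(date, source, extension, pages=None):
    """<Date> <Source name>, <Page number>.<file type>"""
    name = f"{check_date(date)} {source}"
    if pages:
        name += f", {pages}"
    return f"{name}.{extension}"


print(person_source_name("1968-11-23", "Amos Ross BANNISTER", "Birth Certificate", "png"))
print(general_source_name("2014-03-05", "Narooma News", "tiff", pages="p34"))
```

The two print calls reproduce the first and third sample names listed above.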

It might seem strange lumping all the sources into one set of folders, mixing up sources for different people in the one place, but here is where the magic comes into play. The People folder does not contain any actual files, but this is where you will come to find all the files relating to a particular person. How can this happen? The answer is a nifty feature of modern filesystems called symbolic links. A symbolic link allows you to give a file multiple names and even make it appear in multiple folders and my file naming scheme uses symbolic links to avoid duplication of files.

Some sources naturally pertain to multiple people. For example, a birth certificate is not only a relevant source for the child who was born, but it can also be a source for the father and mother of the child and possibly even siblings who may be named in the document. A marriage certificate is obviously a source for both the bride and the groom and might also contain details of the parents of one or both. By using symbolic links I can store a marriage certificate in the Sources > BMD > Marriage Certificates folder and then create a link to this document under the folders for each person named in the document. The actual file is located in only one place on disk, but it can be accessed from several alternate locations.

Even better, when creating a link to a file it is possible to rename the linked file. This means while the original file might be called "1956-04-13 Kennett John BANNISTER and Colleen Dawn WALTERS - Marriage Certificate.png", under the individual folders for Kennett John and Colleen Dawn the link might be renamed to simply "1956-04-13 Marriage Certificate.png".
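
Links like this can also be created from a script rather than by hand. The following is a minimal Python sketch, assuming a filesystem that supports symbolic links (macOS and Linux do); link_source is a hypothetical helper name. It makes the link relative, so the whole family folder can be moved or copied without breaking the links inside it.

```python
import os
from pathlib import Path


def link_source(source_file, person_folder, link_name=None):
    """Create a symbolic link to source_file inside person_folder.

    link_name lets the link carry a shorter name than the original file;
    if it is omitted the link keeps the original filename.
    """
    source_file = Path(source_file)
    person_folder = Path(person_folder)
    person_folder.mkdir(parents=True, exist_ok=True)

    link = person_folder / (link_name or source_file.name)
    # Store a path relative to the person's folder, not an absolute one.
    target = os.path.relpath(source_file, start=person_folder)
    os.symlink(target, link)
    return link
```

Calling it once per person named in a document gives each of them a (possibly renamed) link while the scan itself stays in its single home under Sources.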

So what does my People folder actually look like? Under the People folder I have a subfolder for each surname in my family tree. Within each surname folder I create folders for each person born with that name. Now it might be possible that at some point in my family tree I have two people sharing the same surname who are not directly related - this doesn't matter, they would both be placed under the same surname folder. These folders are not family groups, they are just names, just like the names in a phone directory.

In each surname folder I create a subfolder for each person with that surname. These folders have a specific naming convention:
<First name(s)> <SURNAME> (<born> - <died>)
(I tossed up whether to include the surname in the person's folder name, as it can be inferred from the parent folder, but decided to leave it in. You might choose otherwise.)

The <born> and <died> fields are the year only. If more than one person with the same surname shares the same birth and death years you could include the month and/or day to further differentiate them, but for most people the birth and death years should be enough to identify them. If the person is still living, then I use "living" for <died>, and if one or both of the dates is (as yet) unknown I simply use "unknown" for the year.

So some sample folder names would be:
Amos BANNISTER (unknown - 1782)
Amos Parker BANNISTER (1868 - 1954)
Amos Parker BANNISTER (1907 - 1979)
Amos Ross BANNISTER (1968 - living)
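
The folder naming convention is simple enough to capture in a few lines of Python. This is only a sketch with a made-up function name; it handles the year-only form and the "living" and "unknown" placeholders, but not the optional month/day refinement.

```python
def person_folder_name(first_names, surname, born=None, died=None, living=False):
    """Build '<First name(s)> <SURNAME> (<born> - <died>)'.

    born and died are years; None means the year is not (yet) known.
    """
    born_text = str(born) if born is not None else "unknown"
    if living:
        died_text = "living"
    else:
        died_text = str(died) if died is not None else "unknown"
    return f"{first_names} {surname.upper()} ({born_text} - {died_text})"


print(person_folder_name("Amos", "Bannister", died=1782))
print(person_folder_name("Amos Ross", "Bannister", born=1968, living=True))
```

The two calls print the first and last of the sample folder names above.
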

Here's where the symbolic links come into play. Within each person's individual folder, I create a symbolic link to all the sources pertinent to that person. I can rename the link to a more user-friendly name if I choose, or I could leave the link with the original filename. Now when I want to find a particular document relating to a person, I can just drill down to that person's individual folder and I will see every source document for that person, sorted by date, and I can see at a glance their entire timeline. Opening the links will open the original document, no matter where in the source folder hierarchy the document actually lives.

So what does this look like? Here's a snippet of my family tree (which I am rebuilding) showing my father's folder as it currently looks. (Used with his permission.)

You can see how each source document appears in chronological order. The little arrow at the bottom left of each document icon indicates that it is a symbolic link and the original document lives somewhere else on disk. (When I get around to sorting through my photos, I will also be creating symbolic links to photos in each person's folder, so my father's folder will also contain links to all the photos he appears in.)

With this folder structure, file naming scheme and using symbolic links, I now have an easy way to find any document relating to an individual I want so long as I know their given name(s) and surname at birth. Of course there is the issue of what to do with people who change their name, whether through marriage, adoption or some other means? This is easily solved with symbolic links.

Let's look at marriage. Generally when two people get married, the female will take her husband's surname as her own. In this case I create a symbolic link from the female's individual folder (which lives under her maiden name's folder) in the folder for her new surname. So my mother's individual folder is "Colleen Dawn WALTERS (1936 - living)" which is located under the WALTERS surname folder, and I created a link to this folder under the BANNISTER folder. In the process I renamed the linked folder to "Colleen Dawn (WALTERS) BANNISTER (1936 - living)" to indicate her maiden name when I view her folder under the BANNISTER folder, but her original individual folder under WALTERS remains unchanged. So my People folder looks like this:

Now I can access my mother's source documents by drilling down through her maiden name (WALTERS) or her married name (BANNISTER). Whichever path I take, I will see all the same documents, covering her entire life. If she were to remarry, I would create a new link under her new married surname and she would appear in three places, all with the same documents accessible.
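
The same trick works for whole folders. Below is a hedged Python sketch of the married-name link described above; link_married_name is a hypothetical helper, and it assumes the person's original folder already exists under their birth surname.

```python
import os
from pathlib import Path


def link_married_name(people_root, first_names, maiden_surname, married_surname, years):
    """Link a person's folder (kept under their birth surname) into the
    folder for their married surname, under a name that shows both.

    years is the text inside the brackets, e.g. "1936 - living".
    """
    people_root = Path(people_root)
    original = people_root / maiden_surname / f"{first_names} {maiden_surname} ({years})"

    married_folder = people_root / married_surname
    married_folder.mkdir(parents=True, exist_ok=True)

    link = married_folder / f"{first_names} ({maiden_surname}) {married_surname} ({years})"
    # Relative target, so the People folder can be moved as a unit.
    target = os.path.relpath(original, start=married_folder)
    os.symlink(target, link, target_is_directory=True)
    return link
```

A second marriage would simply mean calling the helper again with the new surname; every link still points back to the one real folder under the birth surname.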

There is one special folder under People called "zz Unknown". The zz Unknown folder is a temporary holding cell for people whose surname I have not yet identified. For example, one of my ancestors, Amos Bannister was born in 1771 and his parents were listed as Amos Bannister and Catherine. I have not yet found any other information about Catherine, so I do not know her maiden name, so she gets a folder in the zz Unknown folder until I find more information. Once I do identify who Catherine was, I would create a new folder for her surname, then move her individual folder into the correct location.

The Places folder uses symbolic links in a similar way, to group sources relating to a place, such as maps, pictures, histories, etc. The subfolder hierarchy for Places is broken down by Country, State, County, Town and optionally folders for individual sites, such as churches or houses. I haven't actually created many folders in my Places yet, but as a rough guide the folder hierarchy might look like this:

  • Places
    • Australia
      • Victoria
        • Hotham
        • West Brunswick
          • 33 Burnell Street
    • UK
      • England
        • Lancashire
          • Stretford
            • St Matthews
            • Edge House
      • Ireland
        • Cavan
          • Bailieborough
      • Scotland
        • Moray
          • Elgin
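
If you would rather not click "New Folder" a couple of dozen times, a hierarchy like this can be described as nested data and created in one go. This is a small Python sketch; the PLACES dictionary simply mirrors the rough guide above, and create_folders is an illustrative helper name.

```python
from pathlib import Path

# Mirrors the outline above; an empty dict means "no subfolders yet".
PLACES = {
    "Australia": {
        "Victoria": {
            "Hotham": {},
            "West Brunswick": {"33 Burnell Street": {}},
        },
    },
    "UK": {
        "England": {
            "Lancashire": {"Stretford": {"St Matthews": {}, "Edge House": {}}},
        },
        "Ireland": {"Cavan": {"Bailieborough": {}}},
        "Scotland": {"Moray": {"Elgin": {}}},
    },
}


def create_folders(parent, tree):
    """Recursively create one folder per key in tree under parent.

    Returns the list of folder paths, in the order they were visited.
    Folders that already exist are left alone.
    """
    created = []
    for name, children in tree.items():
        folder = Path(parent) / name
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
        created.extend(create_folders(folder, children))
    return created
```

Running create_folders("Places", PLACES) again after adding a new town to the dictionary only creates the missing folder, so the outline can grow along with the research.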


So that's an overview of my file and folder naming scheme. It might sound complex, but I find it to be quite a powerful system. When I get a new source document, I immediately name it appropriately and place it in the correct subfolder under the Sources folder. As I add a source to a person (or place) I create a link to the source document under the relevant person's (or place's) individual folder. I find if I do this as I get sources and add them to people, the folders stay in sync with my family tree and I can quickly and easily locate any document for any person (or place) without any problems.

Previously I had documents scattered all over the place and it was almost impossible to find what I wanted when I wanted it. A side effect of my previous "system" was massive duplication of files. I had multiple copies of the same file stored under different people and when I was using family groups as the core of my folder structure I had no end of problems working out where people belonged when they married, divorced and remarried. Now every source document belongs in only one place and every individual has only one folder in a clearly defined place. By using symbolic links I can access these documents and people in various locations, but when I create a new person or a new document there is only ever one place that person/document could be created. This has helped me no end in keeping my files in order.

I would love to hear your feedback on this system. Do you have any questions about how it works or where certain documents would be stored? Can you see any problems with what I have described? Does it make sense to you or not? What file storage/naming scheme do you use for your digital files?