Do While, Switch Case, or If / Else If: which approach for this problem?


#1

I have a plain text file to read in and then process into MySQL tables.

This is a GEDCOM file (none of the available parsers I have found cover what I need). My first thought was to read the whole file into an array, but it has too many rows for that. So what I want to do is step through each person's record and extract the data. The issue is that the data items in a person's record can come in a different order, and there are data items that I don't need.

See this example; I need everything other than the italicised rows:

0 @I802@ INDI
1 NAME Alice Lily /Lefever/
2 GIVN Alice Lily
2 SURN Lefever
1 SEX F
1 BIRT
2 DATE ABT 1882
2 PLAC Shoreditch, London
1 NOTE From FreeBMD birth Dec Q 1882 Shoreditch Vol 1c page 36
1 _UID 1A5330EF2C17F64DB8BD1FE7E2FD11017E68
1 CHAN
2 DATE 29 JAN 2014
3 TIME 08:59:19
0 @I803@ INDI
1 NAME Harriet Matilda /Lefever/
2 GIVN Harriet Matilda
2 SURN Lefever
1 SEX F
1 BIRT
2 PLAC Bethnal Green, London
2 DATE ABT 1882
2 ADDR 22 High Street
1 NOTE From FreeBMD birth Dec Q 1882 Bethnal Green Vol 1c page 179
1 _UID 958278B7ECB57248BED81D971EDD4E447804
1 DEAT
2 DATE ABT 1882
2 PLAC Shoreditch, London
2 ADDR 47 Bow Road
1 CHAN
2 DATE 29 JAN 2014
3 TIME 10:46:59

I have a while loop that loops through until the end of the file.

What is the best way to read in all the rows for one individual and then process them? I have found that an if / else if chain does not get me what I need, because the date and place rows can appear in either order.

An individual's record always starts with a 0.
The type of event for an individual always starts with a 1 (however, other rows I am not interested in can also start with a 1).
The details that I require from an event always start with a 2 (however, other rows I am not interested in can also start with a 2).

I am not asking for anyone to code this for me but to give me some suggestions as to the best way of coding this so I don’t waste a lot of time.
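
The level rules above can be sketched as a small line splitter. This is only an illustration, assuming Python (the thread does not name a language) and assuming each row has the shape "level, optional @id@, tag, optional value" as in the sample; `parse_line` is a hypothetical name.

```python
def parse_line(line):
    """Split one GEDCOM-style row into (level, xref, tag, value).

    Assumes the layout seen in the sample rows:
    "<level> [@id@] <tag> [value]".
    """
    parts = line.strip().split(" ", 2)
    level = int(parts[0])
    if len(parts) > 1 and parts[1].startswith("@"):
        # Record header such as "0 @I802@ INDI": the id comes before the tag.
        xref = parts[1]
        rest = parts[2] if len(parts) > 2 else ""
        tag, _, value = rest.partition(" ")
    else:
        # Ordinary row such as "2 DATE ABT 1882".
        xref = None
        tag = parts[1] if len(parts) > 1 else ""
        value = parts[2] if len(parts) > 2 else ""
    return level, xref, tag, value
```

With the level and tag separated like this, the order of the rows inside an event no longer matters, because each row is identified by its tag rather than by its position.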


#2

The loop type doesn't matter, though I mostly prefer to use foreach, as I think it makes naming things properly easier.

The biggest challenge as I see it is figuring out which date fields you want. It seems you do not want the field that holds an actual date, but the one with ABT 1882…?

Once you have defined what data you are after and how to tell it apart from the unwanted data, it should be pretty easy to write a simple parser.

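Pseudocode:

wantedColumns = ["NAME", "GIVN", etc...];
record = [];
for each rows as row {
    if row starts with 0 {
        // found a new record
        // save the last record we found (if any)
        if record is not empty
            saveRecord(record)
        // then start a new one
        record = ["id" => idFromString];
        // skip to the next row
        continue;
    }

    // the tag is the second word on the row, after the level digit
    tag = second word of row
    if tag is in wantedColumns
        record[tag] = trim ( rest of row after tag )
}
// the last record is still pending when the loop ends
if record is not empty
    saveRecord(record)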

That should do the trick.
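
Since the file is too large to hold in an array, one way to apply the approach above is to group rows lazily, one record at a time. A minimal sketch, assuming Python for illustration and assuming every record begins with a row whose first character is 0; `iter_records` is a hypothetical name.

```python
def iter_records(lines):
    """Yield one list of rows per level-0 record, reading lazily.

    `lines` can be any iterable of strings, including an open file,
    so only one record is held in memory at a time.
    """
    record = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line.strip():
            continue  # ignore blank rows
        if line.startswith("0 ") and record:
            # A new record begins, so hand back the one just finished.
            yield record
            record = []
        record.append(line)
    if record:
        # The last record has no following "0" row to trigger the yield.
        yield record
```

Passing an open file handle to `iter_records` processes the file row by row. Every level-0 block is yielded, so the caller would still need to keep only the blocks whose first row ends in INDI.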


#3

Thanks, I will give that a go.

I don't want:

1 CHAN
2 DATE 29 JAN 2014

That is the date the record was last changed. What I want is the ABT 1882, as this is the date of a historical event.
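
One way to tell the two kinds of DATE row apart is to remember the most recent level-1 tag and keep a level-2 row only when that parent is an event of interest. A rough sketch, assuming Python for illustration; the tag sets and the name `build_person` are assumptions, not something fixed by the thread.

```python
EVENTS = {"BIRT", "DEAT"}           # assumed level-1 events to keep
DETAILS = {"DATE", "PLAC", "ADDR"}  # assumed level-2 details to keep
NAME_PARTS = {"GIVN", "SURN"}


def build_person(rows):
    """Turn the rows of one individual into a dict, ignoring unwanted rows."""
    person = {"id": None, "name": None, "givn": None, "surn": None,
              "sex": None, "events": {}}
    parent = None  # most recent level-1 tag
    for row in rows:
        level, _, rest = row.strip().partition(" ")
        tag, _, value = rest.partition(" ")
        if level == "0":
            person["id"] = tag.strip("@")  # "@I803@" becomes "I803"
        elif level == "1":
            parent = tag
            if tag == "NAME":
                person["name"] = value
            elif tag == "SEX":
                person["sex"] = value
            elif tag in EVENTS:
                person["events"][tag] = {}
        elif level == "2":
            if parent in EVENTS and tag in DETAILS:
                # DATE and PLAC may arrive in either order.
                person["events"][parent][tag] = value
            elif parent == "NAME" and tag in NAME_PARTS:
                person[tag.lower()] = value
        # Level 3 and deeper (for example TIME under CHAN) is ignored.
    return person
```

Because the DATE under CHAN has CHAN as its parent, it is skipped, while the DATE under BIRT or DEAT is stored with its own event, so a death date does not overwrite a birth date.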