Raider of the lost MARC: retrieving deleted/replaced Voyager records

I’ve never worked with Backstage Library Works but always enjoy their postcards.

Have you ever overlaid (or just deleted) the wrong record in Voyager? If you have server access, you can retrieve the old copy of such a record with one command.

The Problem

Voyager lacks intuitive version control for its records. For a given record, the Cataloging module readily lists who made which categories of edits and when, but it keeps no record of which lines were actually changed. No information is readily available for deleted records. This is particularly frustrating when you have accidentally overlaid the wrong record and want to restore it to its previous state. Next-generation ILSs coming onto the market advertise version control, but it is sadly lacking in Voyager and many older systems. Fortunately, Voyager does retain records that have been replaced or deleted, and you can retrieve them if you have some server access and the ability to process MARC.

Tools: Cygwin

If you run Windows, you can use a lot of powerful Unix-like tools with Cygwin. When you install Cygwin, you choose which packages to include. I typically select at least the following:

  • openssl/openssh – standard tools for secure communication between computers
  • curl, wget – command line tools for capturing web pages
  • vim – my text editor of choice
  • perl, python, ruby – popular scripting languages
  • git, mercurial, subversion – version control systems
  • ImageMagick, pdftk, antiword – tools for manipulating documents
  • zip, unzip – this functionality is in Windows already, but it’s nice to have from the command line.

Once Cygwin is installed, you can access these tools using the Cygwin Bash Shell, a command prompt. Files are available under the /cygdrive directory tree; for example, the directory C:\local can be accessed as /cygdrive/c/local. If there are multiple commands you’d like to run in a row, you can save them as a bash script for re-use.
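To make the path mapping concrete, here is a small sketch of a helper that rewrites a Windows path into its /cygdrive form. The function name win_to_cygdrive is made up for illustration, and it only handles plain drive-letter paths; Cygwin's own cygpath utility (cygpath -u) does the same job more robustly.

```shell
#!/usr/bin/env bash
# Hypothetical helper: convert a drive-letter Windows path (C:\local)
# to the form Cygwin exposes under /cygdrive (/cygdrive/c/local).
win_to_cygdrive() {
    win_path=$1
    # First character is the drive letter; Cygwin uses it in lowercase.
    drive=$(printf '%s' "$win_path" | cut -c1 | tr '[:upper:]' '[:lower:]')
    # Drop the leading "C:" and turn backslashes into forward slashes.
    rest=$(printf '%s' "${win_path#?:}" | tr '\\' '/')
    printf '/cygdrive/%s%s\n' "$drive" "$rest"
}
```

For example, win_to_cygdrive 'C:\local' prints /cygdrive/c/local. Note the single quotes, which stop the shell from treating the backslash as an escape character.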

Tools: MarcEdit

If you have done any batch editing of MARC records recently, you are no doubt familiar with Terry Reese’s MarcEdit. MarcEdit converts between binary MARC and mnemonic text MARC, and includes a powerful and friendly text editor for performing common batch edits (delete all instances of this field, search using this regular expression, etc.). It has many features, but the most useful to me recently has been its command line converter, cmarcedit. It can perform marc-breaking and marc-making, including character set conversion, so it is very handy to include in scripts that do MARC processing.

The Program

The tasks to be performed are:

  1. Copy the appropriate file(s) from the server to your local computer
  2. Extract the desired record

There’s no reason Step #1 couldn’t be performed with an interactive FTP client, or Step #2 with the regular MarcEdit interface, but in practice, I can’t be bothered to do all that clicking and path-remembering, so I wrote a bash script to handle it. That script is available from my Voyager GitHub repository.
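The repository script is the real thing and is not reproduced here. As a rough illustration of the two steps only, a sketch might look like the following. Everything specific in it is an assumption: the host, account, and directories are placeholders, the helper names run and fetch_and_break are hypothetical, and the cmarcedit.exe options (-s, -d, -break) are written from memory of MarcEdit's command-line help, so check them against your installed version. By default it does a dry run that only prints the commands it would execute.

```shell
#!/usr/bin/env bash
# Sketch only: every value below is a placeholder, not a real Voyager setting.
REMOTE="voyager@voyager.example.edu"    # hypothetical ssh account and host
REMOTE_DIR="/path/to/your/db/files"     # placeholder: where your database's files live
LOCAL_DIR="/cygdrive/c/local/Voyager"   # local directory to receive the files
CMARCEDIT="cmarcedit.exe"               # assumed to be on your PATH
DRY_RUN="${DRY_RUN:-1}"                 # 1 = print commands instead of running them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Fetch one file (e.g. replace.bib) and convert it to mnemonic text.
fetch_and_break() {
    name=$1
    # Work inside the local directory so the Windows program can be
    # given bare file names rather than /cygdrive paths.
    run cd "$LOCAL_DIR"
    # Step 1: copy the binary MARC file from the server.
    run scp "$REMOTE:$REMOTE_DIR/$name.marc" "$name.marc"
    # Step 2: break binary MARC into a .mrk text file (flags are an assumption).
    run "$CMARCEDIT" -s "$name.marc" -d "$name.mrk" -break
}

fetch_and_break replace.bib
```

Running it with DRY_RUN=0 in the environment would execute the commands for real. The dry-run default is a deliberate choice, since a typo in the paths is easier to spot in printed output than in an scp error.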

If you want to use the script on Windows, you’ll need to install Cygwin and MarcEdit. (On Linux and Mac, you can install the needed tools, and the script will need minimal editing.) You’ll also need ssh access to your Voyager server, and information about where the files associated with your database are located on the server.

Install the script by saving it to your hard drive in a place that’s easy to access from Cygwin, such as your home directory. If you installed Cygwin in C:\local\cygwin\, your home directory would be C:\local\cygwin\home\<account>. Once it is saved there, type the following into the Cygwin bash shell to change permissions, making it executable:

    $ chmod 755 get_lost_marc.sh

Edit the file using your favorite text editor, such as Notepad or vim. The first several lines of the script should be modified to reflect your database parameters and the locations of files on your system. You may wish to create a new directory for the script to dump the MARC files in, such as C:\local\Voyager. After the file is modified, you can run it from the Cygwin bash shell prompt with a command like:

    $ ./get_lost_marc.sh

After you run the script successfully, there will be several .mrk files in the directory you specified. The files have descriptive names, so it should be clear where to find your records, for example:

  • replace.bib.mrk – bibliographic records that have been replaced
  • deleted.auth.mrk – authority records that have been deleted

These files can be viewed with any text editor, including MarcEdit’s MarcEditor, Notepad, or vim. Keep in mind that a file may be large, as it contains every record that has been replaced or deleted, possibly since your last upgrade. Once the file is open, you can search to find your record by whatever you know about it, such as its title (245) or bibliographic accession number (001). If you need to restore Voyager to its previous state, you can copy the text into its own file, compile it back to binary MARC with MarcEdit, and load that into Voyager with any of your usual methods (import through the Cataloging module, bulk import, etc.).
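If you already know the 001 value, the search can also be done from the shell. The following is a sketch, assuming the usual MarcEdit mnemonic layout in which each field starts with =TAG followed by spaces and records are separated by a blank line. The function name find_record_by_001 is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every record in a .mrk file whose 001
# field equals the given value.
find_record_by_001() {
    id=$1
    file=$2
    # Strip carriage returns first so Windows line endings do not
    # hide the blank lines between records.
    tr -d '\r' < "$file" | awk -v id="$id" '
        BEGIN { RS = "" }              # paragraph mode: one record per blank-line block
        {
            n = split($0, lines, "\n")
            for (i = 1; i <= n; i++) {
                if (lines[i] ~ /^=001/) {
                    value = lines[i]
                    sub(/^=001[ \t]*/, "", value)   # drop the tag and padding
                    sub(/[ \t]+$/, "", value)       # drop trailing whitespace
                    if (value == id) {
                        print $0
                        print ""                    # keep a blank line after the record
                    }
                }
            }
        }'
}
```

A call such as find_record_by_001 12345 replace.bib.mrk > restore.mrk would write the matching record to its own file, ready to be compiled back to binary MARC. If the same record was replaced more than once, every saved copy is printed, so check which version you actually want.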

Records lost in batch

If you do batch loading (which can also overlay records), the replaced records do not go into the main replace.bib.marc file; rather, they go into a file specific to the batch. When I started batch loading, I mainly looked at files like rpt/log.imp.20130214.0743 (which contains a log of the load), so I was delighted to discover its siblings:

  • rpt/delete.imp.20130214.0743 – MARC records deleted in the batch process
  • rpt/discard.imp.20130214.0743 – MARC records discarded in the batch process (multiple overlay candidates)
  • rpt/err.imp.20130214.0743 – MARC records from the batch with errors (character encoding, format problems)
  • rpt/reject.imp.20130214.0743 – MARC records that could not be added (low quality)
  • rpt/replace.imp.20130214.0743 – MARC records replaced in the batch process

On the rare occasions I need one of these, I either muddle through the binary version on the server, or manually type the scp commands and view them with MarcEdit; if one needed them more frequently, the script could be modified to accept a date/time argument. (Please share if you make this version!)
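One possible shape for that date/time version is sketched below. It is not taken from the existing script: the host, account, and database directory are placeholders, and the names batch_files, fetch_batch, and run are invented for illustration. It builds the five file names listed above from a YYYYMMDD.HHMM stamp and, by default, only prints the scp commands.

```shell
#!/usr/bin/env bash
# Sketch only: placeholders, not real Voyager settings.
REMOTE="voyager@voyager.example.edu"   # hypothetical ssh account and host
REMOTE_DB_DIR="/path/to/your/db"       # placeholder: directory that contains rpt/
DRY_RUN="${DRY_RUN:-1}"                # 1 = print commands instead of running them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# List the per-batch MARC files for one load, e.g. 20130214.0743.
batch_files() {
    stamp=$1
    for kind in delete discard err reject replace; do
        printf 'rpt/%s.imp.%s\n' "$kind" "$stamp"
    done
}

# Copy those files into the current directory.
fetch_batch() {
    stamp=$1
    # Reject anything that is not in YYYYMMDD.HHMM form.
    case $stamp in
        [0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9].[0-9][0-9][0-9][0-9]) ;;
        *) echo "usage: fetch_batch YYYYMMDD.HHMM" >&2; return 1 ;;
    esac
    for f in $(batch_files "$stamp"); do
        run scp "$REMOTE:$REMOTE_DB_DIR/$f" .
    done
}
```

Calling fetch_batch 20130214.0743 would then fetch the whole set for that load. Some of the files may not exist for a given batch, in which case scp will simply report them as missing.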