One of the goals under the BV Museum’s Strategic Plan is increasing engagement with and access to our collection items in the digital realm. Digitization has been a primary focus of our Museum over the past few years, beginning with the launch of our Collections Online website in 2016 (thanks to funding from the Documentary Heritage Communities Program). With the launch of the site we were immediately able to make over 4900 historic photographs available online. We’ve grown our digital collections considerably since then, including adding another 400+ historic photos to the site, as well as PDFs of documents and newspapers, and over 1600 photographs of artifacts.

Our latest digitization project has focused on what we refer to as “oversized” items. “Oversized” records are generally maps, plan drawings, newspapers, or other large-format visual records. For the purposes of this project, “oversized” refers to items that could not be scanned by our 12″ x 17″ large format document scanner. This includes cemetery records and plans that were donated to the Museum’s Archives by the Town of Smithers last year. In 2018, we secured funding from the Bulkley Valley Community Foundation to have these oversized records scanned using the wide-format scanning services available at Bulkley Valley Printers.

Scan of the Smithers Sentinel from May 12, 1915.

Not every oversized item in our Archives will be digitized, due to limits on time and funding. Records were prioritized for digitization for three main reasons: current condition, content/subject matter, and public access. Some of the oversized items were scanned for all three reasons. This was the case for newspapers like the Hubert Times and Bulkley Valley Advertiser, Smithers Review, and Smithers Sentinel, of which few, if any, copies exist in other museum and archive collections. As well, very few people realize such records exist in our Archives, so by digitizing these records we can bring more awareness to them and the information they hold.

Newspapers are notoriously hard to preserve because of the acidity of the paper they are printed on, yet they are of great interest to the public, documenting the earliest people, places, and businesses in Smithers. Thus, the newspapers in our archival collection were ideal candidates for digitization. Digitizing these records and making them accessible online ensures that they can be freely viewed by the public while also ensuring they are not regularly handled, which would hasten their deterioration.

To ensure that these oversized records could be safely fed through the wide-format scanner, they first had to be encapsulated. When items are encapsulated, they are encased between two sheets of archival-safe polyester film. The sheets are taped together using a non-off-gassing double-sided tape, leaving one or two sides open so that the record can “breathe” and be easily removed if necessary (unlike a process like lamination, which is permanent and never recommended for archival items). In some cases one plastic sleeve was reused for several uniform-sized items, such as the Pictorial newspapers. Items that were more fragile, with tears, holes, or paper weakened over time by folding, were encapsulated in their own custom-made sleeves and will be permanently stored in these sleeves. However, some oversized records were too fragile to withstand being fed through a wide-format scanner (which uses rollers to feed the paper through) and could not be digitized at this time.

After scanning is complete, the Museum receives high-resolution JPEG files from BV Printers. Ideally, digital records should be saved as TIFFs, PDF/As, or other lossless formats. Lossless means that no image information is discarded when the digital record is saved, even if the file is compressed. JPEG, by contrast, is a lossy format: some image detail is discarded in order to produce a smaller file. Usually the loss isn’t perceptible to the user until the image is zoomed in on. Archives usually save digital scans in lossless formats (TIFFs, PDF/As, etc.), ensuring all of the captured information is kept, and then create compressed access copies (JPEGs, PDFs, etc.) for online access. However, digital storage space is just as precious as physical storage space, and lossless formats take up a lot of space on a hard drive. Even the JPEGs we received from BV Printers are at least 20 MB each, compared to the 0.2 MB JPEG files in our photo collection! Those may seem like small numbers, but they add up very quickly.
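For readers who like to see the difference between lossless and lossy in practice, here is a minimal Python sketch. It is only an illustration under stated assumptions: the "scan" is a made-up list of brightness values, zlib stands in for a lossless method, and the lossy step is a toy rounding function, not the actual JPEG algorithm or anything used by the Museum or BV Printers.

```python
import zlib


def lossless_round_trip(pixels: bytes) -> bytes:
    """Compress and then decompress with zlib, a lossless method."""
    return zlib.decompress(zlib.compress(pixels))


def lossy_round_trip(pixels: bytes, step: int = 16) -> bytes:
    """Toy lossy step: round each brightness value down to a multiple of `step`.

    This is NOT how JPEG works internally. It only shows that once detail
    has been discarded, it cannot be recovered from the saved copy.
    """
    return bytes((value // step) * step for value in pixels)


# Hypothetical 8-bit grayscale "scan": a smooth gradient repeated 100 times.
original = bytes(range(256)) * 100

restored = lossless_round_trip(original)
degraded = lossy_round_trip(original)

print(restored == original)  # True: every value comes back unchanged
print(degraded == original)  # False: fine detail has been discarded
```

The lossless copy is identical to the original, while the lossy copy keeps the same number of values but has lost the small differences between them. This is why a lossy file can look fine at normal size and still fall short as a preservation copy.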

Items with printed text, such as newspapers, are then processed through our optical character recognition (OCR) software, ABBYY FineReader Pro.

Screenshot demonstrating how scans are processed using ABBYY FineReader.

OCR software enables digital items to be searchable, allowing researchers to find key words within the texts. It also allows our Collections Online search engine to search within the text of a document and determine whether it should be included in search results.
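To show how OCR text makes a document findable, here is a minimal Python sketch of a word index. It is an illustration only and not how the Collections Online search engine is actually built. The record titles, the sample text, and the function names (`tokenize`, `build_index`, `search`) are all hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(ocr_texts: dict[str, str]) -> dict[str, set[str]]:
    """Map each word to the set of record titles whose OCR text contains it."""
    index: dict[str, set[str]] = defaultdict(set)
    for title, text in ocr_texts.items():
        for word in tokenize(text):
            index[word].add(title)
    return index


def search(index: dict[str, set[str]], query: str) -> set[str]:
    """Return the records whose text contains every word in the query."""
    words = tokenize(query)
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


# Hypothetical records and OCR text, invented for illustration only.
records = {
    "Yearbook": "Smithers High School annual, graduating class and staff",
    "Cemetery plan": "Plan of the Smithers cemetery showing numbered plots",
}
index = build_index(records)

print(search(index, "Smithers High School"))  # {'Yearbook'}
```

Without OCR there would be no text to index, so a scanned page could only be found through its title or description. A word that OCR misreads never enters the index, which is one reason the text is checked by a person afterwards.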

In this example, OCR has enabled this yearbook to come up as a search result for “Smithers High School”, as the search engine could find the relevant words in the text of the document.

Once items are run through the OCR software, they are checked and verified by a Museum staff member or volunteer. While ABBYY is fairly capable at capturing formatting and recognizing words and phrasing, some human intervention is needed. ABBYY recognizes clearly typed fonts best (handwriting cannot be read at all), so some words are misread or missed completely. Formatting can also be an issue: ABBYY distinguishes between images and text (as seen in the photo above), and a person needs to check that images and paragraphs have been recognized correctly. Staff verifying the OCR records therefore tell the software which sections are images and which are text, and supply any missing words, so that the full text is accurately captured and ready for searching.

In total, 13 maps, 18 cemetery maps, 15 plan/architectural drawings, and 49 newspapers (13 of which are currently available online) were scanned through this project. All of the digitized items will be made available on our Collections Online website: search.bvmuseum.org.