Archives Aren’t Backups: Storing Data for the Future

Visalia Direct: Virtual Valley
January 24, 2011 Deadline
March 2011 Issue

Do you remember WordStar? Lotus 1-2-3? Harvard Graphics? If you’ve been using computers as long as I have, you created documents, spreadsheets and graphics in too many applications to remember.

Yes, I have 25-year-old data. I have copied those files from floppies to Iomega Zip disks, from Zip disks to CDs, and most recently from CDs to a trio of external hard drives. Each time I upgrade computers, I migrate data to whatever happens to be the leading archival format.

I migrate data every two to four years. That is important, because media do fail. However, what has enabled me to use old documents is a habit of storing data in two or three formats. In my “Documents” directory, I have created folders named “Archives of…” to store data in neutral formats.

Recently, I wanted to use an old image created in a DOS-based application. I tried several applications, but none of them could import the image file. Thankfully, I had thought ahead nearly twenty years ago and stored copies of the image in standard formats.

Before discussing archival storage, let me share my current data backup strategy.

My wife and I have external hard drives attached to our systems for continuous backups of our data. In addition to Apple’s Time Machine, we copy some documents to an “iDisk” in the “cloud.”

For our purposes, the “cloud” means remote storage. Copies of our data reside on servers at remote locations for safety and convenience. Apple, Google and Microsoft maintain “server farms” in several states. When my wife or I copy a file to the iDisk, we are really sending a copy of the file to one of Apple’s servers. We are depending on Apple not to lose our copies, just as users of Google’s services depend on Google. Most of my manuscripts are on the iDisk, as well as on the external hard drive that sits on my desk. I don’t store any financial data remotely.

The size of files, especially multimedia data, limits the practicality of storing backups on CDs or DVDs. External hard drives hold far more data and are relatively cheap.

Now let me offer some tips to ensure your data are useful for years or even decades.

Each software developer believes their programmers have developed the “best” format for saving your letters, photos, or address book. Also, new technologies mean different types of data need to be stored by new and improved applications. WordStar for DOS didn’t need to store font information, for example, because there was one typeface. It was a glorified typewriter. Today’s word processors have to store font changes, images, and dozens of other layout elements no simple text-editing program could handle.

New features lead to new file formats. As most of us learn the hard way, the old versions of our favorite applications cannot open files created by the new versions. Normally, Microsoft’s Word 2000 cannot open a Word 2007 document. Forget trying to open a poster created with Adobe Illustrator CS5 in Illustrator CS2.

Though you will sacrifice some features, you should save data in standard file formats for long-term archival purposes. These archival files are meant to store important information for emergencies. Standard file formats are good for sharing and archiving “raw” data, not for storing documents or other data for daily work.

In most applications, you can create archival versions of data with “File, Save As” or “File, Export” menu choices. Read the online help for each specific program to determine how you can export different file formats.

Word processing documents should be archived as “plain text” files, which usually have the file extension “TXT.” A file extension tells applications what data format to expect. An alternative archive format for documents, which will store some formatting features, is “rich text format” (RTF). The older Microsoft Word format, DOC, is proprietary. The newer DOCX format has been published as a standard, but it is complex and closely tied to Word. I would not rely on either for archives.

Spreadsheets and databases usually support a file format known as “comma-separated values” (CSV). When you export data to a CSV file, only the raw data are stored. All functions, calculations, and other features are lost. Thankfully, CSV data can be imported into almost every spreadsheet and database application in existence.
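For readers comfortable with a little scripting, here is a minimal sketch of what a CSV export keeps and what it drops. It uses Python's standard csv module. The function names and the sample budget rows are hypothetical, made up for illustration. The “Total” cell stands in for a spreadsheet formula: only its computed value reaches the file.

```python
import csv
import io

def export_to_csv(rows):
    """Return CSV text for a list of rows (each row is a list of cell values)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()

def import_from_csv(text):
    """Parse CSV text back into rows; every cell comes back as plain text."""
    return list(csv.reader(io.StringIO(text)))

# Hypothetical spreadsheet. In the original application the last cell
# would be a formula (such as =B2+B3); here we can only store its value.
sheet = [
    ["Item", "Cost"],
    ["Paper", 12.50],
    ["Ink", 30.00],
    ["Total", 12.50 + 30.00],
]

csv_text = export_to_csv(sheet)
restored = import_from_csv(csv_text)

print(csv_text)       # the raw data, one row per line
print(restored[-1])   # the total survives as text, not as a formula
```

After the round trip, the numbers are still there, but they are text and the calculation that produced the total is gone. That is the trade-off of an archival copy: less convenient, far more likely to be readable later.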

Unfortunately, no image format works flawlessly for every purpose. I suggest storing images in Portable Network Graphics (PNG) and Joint Photographic Experts Group (JPEG or JPG) formats. PNG is an open standard that preserves every pixel, but some older programs do not open PNG files properly. JPEG is also a published standard and is supported almost everywhere, but it works by “losing” data to store smaller files.

If you aren’t storing important data in archival formats, you should consider it. Today’s leading word processor or spreadsheet application might vanish tomorrow, as WordStar and Lotus 1-2-3 users know all too well.

Archival Tips:

1) Store archival data in standard, open formats that several applications can read and write. Avoid archiving data in formats limited to one software developer.
2) Every time you upgrade computer systems, make sure the new system can read your old data. This is particularly important with DVDs and CDs, but even external hard drives can have issues on new systems. If your aging external drive requires a FireWire 400 port and your new computer only has USB 3.0 ports, you’ll need a plan to migrate the data.
3) If you have data on CD-RW or DVD-RW discs, copy the data to write-once CD-R media. The “read-write” disc media (RW) have short lives of one to two years, compared to 15 years or more for write-once CD-R discs. Data DVDs have five-year lifespans.
4) Flash USB drives, solid-state drives (SSDs), and similar devices are not good archival media. Tests reveal they experience data loss more easily than hard drives or disc media.
5) A hard drive is the best external backup currently available, based on cost per megabyte and expected lifespan. Most hard drives are designed to last at least five years; you’ll be upgrading long before a good drive dies.
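The first tip can be checked with a short script. What follows is only a sketch, assuming Python is available and that the formats suggested in this column (plain text, RTF, CSV, PNG and JPEG) are the ones you consider archival. The function name and the extension list are my own choices for illustration, not part of any standard tool.

```python
from pathlib import Path

# Extensions treated as archival here follow this column's suggestions.
# This set is an assumption; adjust it to match your own archive.
ARCHIVAL_EXTENSIONS = {".txt", ".rtf", ".csv", ".png", ".jpg", ".jpeg"}

def find_non_archival(folder):
    """Return sorted paths of files under folder that are not in an archival format.

    Files are judged only by extension (case-insensitive), so a file with
    no extension at all is also reported.
    """
    root = Path(folder)
    flagged = []
    for path in root.rglob("*"):
        if path.is_file() and path.suffix.lower() not in ARCHIVAL_EXTENSIONS:
            flagged.append(path)
    return sorted(flagged)
```

Calling find_non_archival on one of your “Archives of…” folders returns the files that still need a copy in a standard format. The check looks only at file names, so it cannot tell whether a file actually opens. You still have to test that yourself on each new system.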

