

Image copyright (c) 1998 by
PhotoDisc, Inc.

Used under license.
|
Data Files: Finding & Resolving Y2K Problems
|
Overview: About Y2K problems in data files

Keeping Y2K problems out of your data files

Finding Y2K problems in data files

Resolving Y2K problems in data files
Related documents
Checking your data for Y2K problems
(article appearing in the November-December 1999 issue of
Berkeley Computing and Communications
)
Why You Should Enter Dates With Four-Digit Years Whenever Possible
Data Sharing Methods: Finding & Resolving Y2K Problems
Commercial Off-the-Shelf Applications: Finding & Resolving Y2K Problems
Overview: About Y2K problems in data files
|
Some data files may harbor Y2K-related and other date-related problems.
Some of these problems
might first surface when the year 2000 arrives or when dates falling in the
year 2000 and beyond are first used.
Examples of files which could
potentially be at risk include:
- Spreadsheets which contain dates or perform date calculations.
- Database files which contain dates or perform date calculations.
- Text files containing dates which may be used
by statistical or data analysis programs, imported into
database or spreadsheet files, or processed by
custom programs or scripts.
Keeping Y2K problems out of your data files
|
To help keep Y2K-related problems (and other date-related problems) out of
your data files, we strongly recommend these four practices:
- Configure your operating system to display full four-digit years.
When you do this, your application programs which use this display format
will also start displaying dates with four-digit years. This can help you in three ways:
- You can now see immediately, when you enter dates with two-digit years,
whether your program has assigned these dates to the 20th or 21st century.
- This can help prevent dates with century-ambiguous two-digit years from
being exported to text files or copied to the Clipboard. Such dates
can actually change centuries when they are imported into a different program.
- This may enable you to enter a wider range of dates in certain programs.
(Otherwise, you might be restricted to entering only dates falling
within a 100-year "window.")
For information on how to make this change - and for an
important discussion of some problems that can result from making this change - see
Changing Your Operating System's Default Date Format To Use Four-Digit Years
.
- Always enter dates with four-digit years.
If you enter dates with century-ambiguous two-digit years, such as
"05/13/29", which could reasonably fall within either the 20th or 21st centuries,
most spreadsheet and database programs, as well as some other types of programs,
will use a set of rules to guess which century you "probably meant" when you
entered this date.
Unfortunately, sometimes your programs will guess wrong, assigning certain
dates to an incorrect century. This could cause incorrect data to be stored
in your data files. You can avoid this problem altogether by
always entering dates with four-digit years, such as "05/13/1929" or "2029-05-13".
For more details, see
Why You Should Enter Dates With Four-Digit Years Whenever Possible
.
- Exercise caution when using built-in date functions in
spreadsheet and database programs.
Date functions such as Microsoft Excel's
Microsoft Excel's =DATE(), Microsoft Access's =DateSerial(),
and Lotus 1-2-3's @YEAR() may not always return the results you're
expecting when using dates which fall within the year 2000 and beyond.
These date functions can be especially problematic when you use them with
21st century dates containing only two-digit years, such as "00" or "01".
If you use these functions without fully understanding how they handle 21st century
dates, you could introduce incorrect data into your spreadsheets or database files.
For more details, see the
Finding Y2K problems in data files
section
of this document,
Data Files: Finding & Resolving Y2K Problems
.
- Exercise caution when importing dates into your data files.
When you import data containing dates into your files, make sure these dates
are represented in an unambiguous manner.
For example, if the data you're importing contains dates stored as text,
make sure that these dates either:
- Contain full four-digit years, such as "05/13/1929" or "05/13/2029";
or
- Are represented in some other manner which allows your program to unambiguously
determine which century these dates belong to.
This is especially critical in the case of data imported from text files,
pasted in from the Clipboard, or read in via programmatic data streams.
If this data contains century-ambiguous dates with two-digit years,
such as "05/13/29", your program could potentially interpret such dates as
falling into an incorrect century, thus causing incorrect data to be
stored in your data files.
For more details, see
Data Sharing Methods: Finding & Resolving Y2K Problems
.
Finding Y2K problems in data files
|
There are two primary ways of finding date-related problems in data files: you can
manually scan these files or you can use automated software tools to assist you.
- Manually scanning your data files.
If you have only one or two relatively simple files that you must examine
for Y2K problems, you can consider manually scanning these files.
For example, some spreadsheet programs include a feature that allows you
to search for specific text within your spreadsheet data, formulas, and
macros or scripts.
When using recent versions of Microsoft Excel, for instance, you could
locate potentially problematic uses of the built-in =DATE() function
by searching for the partial text DATE( within a spreadsheet's formulas.
- Using automated software tools to help you scan your data files.
If you have even a modest number of critical data files, or any of your files
are relatively complex, we highly recommend that you use automated software tools
to scan these files. Even if you have a few, simple files to check, using an
automated tool can often provide faster and more accurate results when compared
with manual scanning.
Software Tools for Finding & Resolving Y2K Problems
lists a number of
tools that you might consider using for this task.
Some "data file scanning" tools specialize in checking for problems
in only one type of data file: for example, Microsoft Excel spreadsheets or
Microsoft Access databases. Other tools may be able to check a number of
different types of commonly-encountered data files.
In addition, some data file scanning tools are 'standalone' programs that
just perform this single task, while other tools are bundled with Y2K
tool suites.
These tool suite products can find and resolve a variety of
Y2K problems in such areas such as hardware, operating systems,
and applications, in addition to identifying potential problems in your data files.
Whether you're manually scanning your data files or using a tool to assist you,
you might begin your search for potential Y2K problems by looking for any instances
of the following:
- Presence of custom macros, scripts, and user-defined functions.
A small percentage of spreadsheet and database files may incorporate custom-written
macros, scripts, or user-defined functions which improperly handle dates.
As a result, all of the Y2K risks more typically associated with
customized applications
may also apply
to such data files.
For example, JD Consulting's excellent
YEAR 2000 Personal Computer Fix-It Guide
makes the bold claim that "Every spreadsheet should be thought of as a standalone application."
(This quote is reprinted with permission. Copyright 1999. Charles River Media.)
We concur: since spreadsheets may contain
formulas, user-defined functions, macros, and other custom programming,
a complex spreadsheet may often behave in ways more
commonly associated with a custom-written application than a data file.
Recent versions of Microsoft's Office applications, for
example, incorporate a powerful programming language, Visual Basic for Applications (VBA).
VBA code modules could potentially be found in any
Microsoft Excel 97 or 98 spreadsheet.
Fortunately, it is generally the case that only data files created by programmers
and advanced spreadsheet and database users are likely to contain such sophisticated features.
(Of course, a few of these files could end up being distributed throughout a
department or organization, or they could be stored on a server and then be widely accessed.)
As Allen B. Falcon, President of the consulting firm
Horizon Information Group, Inc., pointed out in an August 19, 1999 posting to the
comp.software.year-2000.tech newsgroup
:
Our client/customer base indicates that fewer than 2% (3-5% in
financial analysis departments) use VBA programming in their spreadsheets and databases.
Nonetheless, if a data file is important to key departmental or campus functions, and
it contains custom macros, scripts, or user-defined functions
that deal with dates in any way,
this file will need to be examined for potential problems.
- Use of standard date functions.
Many spreadsheet and database programs provide standard date-related
functions which are known to have Y2K-related "usage issues," such as Microsoft Excel's
=DATE(), Microsoft Access's =DateSerial(),
and Lotus 1-2-3's @YEAR().
Two examples:
- When using Microsoft Excel's
DATE() function,
DATE(05,13,00)
will be interpreted as May 13, 1900, rather
than being interpreted as May 13, 2000, as some Excel users might expect.
(To further complicate matters, with Excel for the Macintosh,
DATE(05,13,00), and more generally, the
DATE function when used with years in the range from 1900 through 1903,
will usually generate the error value #NUM!.)
- When using the @YEAR() function in least some versions of Lotus 1-2-3,
when the function's sole argument is a reference to a cell containing a
date in the year 2000, it will return the three-digit value
100, not the four-digit year
2000 that some 1-2-3 users might expect.
To avoid these problems, read the Y2K compliance statements for your critical
applications carefully. Most vendors will detail any such date-related issues with their
programs' date functions. If at all possible, make sure that other users of these
spreadsheet or database application within your department become aware of these issues.
For suggestions that may help you locate Y2K compliance statements for your important
application programs, see
Commercial Off-the-Shelf Applications: Resources for Identifying Y2K Compliance
and
Y2K Compliance Status of Selected Off-the-Shelf Applications Used at UC Berkeley
.
- Dates stored as text or numbers.
Dates stored in spreadsheet cells or database fields
as text (e.g. "02/21/15") or integers (e.g. "990405") - especially, but not
exclusively, dates with century-ambiguous two-digit years - may
present risks when these values are converted (via standard
date-related functions or otherwise) to dates, or
are compared or sorted.
To avoid these problems:
- Always enter dates into "date" (or "general") formatted spreadsheet cells, or
into "date" fields in database files.
Avoid storing dates in "text", "number", "integer", or similar cells or fields.
- Whenever possible enter dates with full, four-digit years (e.g.
05/13/2029) into spreadsheets,
database files, and other such data files.
If you enter dates with two-digit years (e.g. 05/13/29),
many application programs use
a "windowing" technique to decide which century -- the 20th or 21st -- to which
to assign these dates. This technique is essentially an automated method of
deciding which century you "probably meant," but occasionally this technique
guesses wrong.
To read more about this topic, see
Why You Should Enter Dates With Four-Digit Years Whenever Possible
.
- Never enter partial dates (for example "4/1", intending to refer to April 1st of
the current year) and rely on your spreadsheet or database program to
automatically convert these to full dates (i.e. date serial numbers).
Your program's behavior may not always
be what you expect. When the year 2000 arrives, certain
partial dates could end up being assigned to the wrong year or month, or
even might not be interpretable as valid dates and thus could generate
errors or be stored as text, rather than as dates.
- Data exchanged via text files.
Data exchanged between programs using text files, as well as via other means
such as the Clipboard or programmatic data exchanges, can sometimes be subject to
date-related problems. This is particularly likely if such data contains dates which
are stored or displayed with only
two-digit years (e.g. "05/13/29") or using other century-ambiguous date formats.
To learn more about this area of potential date-related problems, see
Data Sharing Methods: Finding & Resolving Y2K Problems
.
For additional suggestions about what types of problems to look for in your data files,
you might start by reviewing articles such as the following:
Patrick O'Beirne
Year 2000 and Spreadsheets
Patrick O'Beirne
Year 2000 issues in PC Database packages
Dan Haught, FMS, Inc.
Solving the Year 2000 Problem in Microsoft Desktop Application Programs
Luke Chung, FMS, Inc.
Year 2000 Issues in Microsoft Access
Resolving Y2K problems in data files
|
There is no "magic bullet" that will automatically resolve all of your date-related problems
in your critical data files.
As noted above,
automated software tools
can assist you in this task by scanning your data files and helping you identify potential
problems. Most tools will be able to tell you which spreadsheet cells, or which
database records and fields, have potential date-related problems, and provide
you with information about the nature of these problems. Some more
sophisticated tools may even be
able to open your spreadsheet or database files for you, and allow you to
directly edit the cells or fields in which potential problems were detected.
However, even the best tools still primarily rely on human judgment when fixing these problems.
For example,
a tool might be able to detect a problematic use of a date-related function in particular
cell in a spreadsheet document, but it will still be the responsibility of the
the author or maintainer of that spreadsheet to determine how best to fix this problem.
Start early
We've seen reports from a number of sites indicating that large numbers of
spreadsheet and database files at those sites have been identified as having potentially moderate
to serious date-related problems.
(These problems were typically spotted by data file scanning
tools and then a sampling of these problems were confirmed by manual inspection.)
This experience might potentially be the case with at least some computers used for
campus business at UC Berkeley. For this reason, it is important that you
begin the task of scanning key data files for date-related problems
relatively early in your departmental or personal Y2K efforts. If you have
tens, hundreds, or even thousands of data files which contain or process dates,
and many of these files have date-related problems, it could take a long time to
resolve these problems.
Some tips for managing the task of checking and fixing data files
- Separate active files from those which are no longer being used.
When starting out, it's often very helpful to perform some cleanup.
You might separate out files which:
- Are no longer used;
- Are exact duplicates of other files;
- Represent successive past versions of a current file; and/or
- Are variant forms of a primary, current file.
You might move these files to a special folder or directory to help
you avoid wasting effort scanning these files.
- Focus repair work on just the highest priority files.
Work to resolve problems first on the files which are of the highest
importance to your department's or the campus's mission, then tackle
other files when you have time to do so.
- Use your tool's capabilities (if any) to help you identify the
highest priority problems in your files.
Some data file scanning tools will rank potential date-related problems
by assigning a (tool-calculated) priority or severity level to entire files
or to individual problems identified within these files.
Use this feature to help you focus on the problems which most urgently
require fixing.
- Focus on date functions and calculations.
If you're checking spreadsheets for Y2K problems, focus your efforts on formulas
which contain built-in date functions or otherwise process dates.
The same advice is equally appropriate for any types of data files which include
custom programming, such as user-defined functions, scripts, or
macros in spreadsheets and scripts or programs associated with databases.
Concentrate your efforts on just those
areas of your data files which use built-in date functions or otherwise
explicitly process dates.
- Don't waste time "fixing" dates which are visually displayed with two-digit years.
Dates which are displayed with century-ambiguous two-digit years,
such as 05/13/29,
are often flagged as problematic by data file scanning tools. However, even if
many dates are visually displayed with two-digit years,
such dates are often internally stored in spreadsheets or database files
in a manner which accurately preserves four-digit year information.
(In some types of spreadsheets, for example, dates are typically stored as
"serial dates": the number of days past a starting or
"epoch" date.)
These dates can be safely used in date calculations, regardless of
how they are displayed.
It is true that dates which are stored internally
with four-digit year information but
are displayed with only two-digit years could potentially cause problems.
Most commonly, these problems will occur if these dates
they are exported as text or via the Clipboard. This is because
some programs use the as-displayed format, rather than the internal storage format,
when exporting data via these methods.
However, it's not all that common to share spreadsheet or
database data using text files or the Clipboard. More typically,
spreadsheet and database files are passed around via disks or e-mail, or
shared via file servers.
In addition, 'dates' in spreadsheets or database files
which are stored in text or numeric formats and are
century-ambiguous could cause a variety
of date-related problems. However, you won't encounter these problems
unless these 'non-date dates' are actually used in date calculations.