SOFTWARE HORROR STORIES
- The Mars Climate Orbiter crashed in
September 1999 because of a "silly mistake": wrong units in a program. Story Story Report
- The 1988 shooting down of an Airbus A300
by the USS Vincennes was attributed to the cryptic and misleading output
displayed by the tracking software. Story
- Death resulted from inadequate testing
of the London Ambulance Service software. Story
- Several 1985-7 deaths of cancer patients
were due to overdoses of radiation resulting from a race condition between
concurrent tasks in the Therac-25 software. Report Report
Story More
More
More
- Errors in medical software have caused
deaths. Details in B.W. Boehm, "Software and its Impact: A Quantitative
Assessment," Datamation, 19(5), 48-59(1973).
- An Airbus A320 crashes at an air show.
Story
- A China Airlines Airbus Industrie A300
crashes on April 26, 1994 killing 264. Recommendations include software
modifications. Summary
- The British destroyer H.M.S. Sheffield
was sunk in the Falkland Islands war. According to one report, the ship's
radar warning systems were programmed to identify the Exocet missile as
"friendly" because the British arsenal includes the Exocet's homing device;
they therefore allowed the missile to reach its target, namely the Sheffield. From "The
development of software for ballistic-missile defense," by H. Lin,
Scientific American, vol. 253, no. 6 (Dec. 1985), p. 48.
- An error in an aircraft design program
contributed to several serious air crashes. From P. Naur and B. Randell, eds.,
Software Engineering: Report on a Conference Sponsored by the NATO Science
Committee, Brussels, NATO Scientific Affairs Division, 1968, p.
121.
- An Air New Zealand airliner crashed into
an Antarctic mountain; its crew had not been told that the input data to its
navigational computer, which described its flight plan, had been changed. From
"The development of software for ballistic-missile defense," by H. Lin,
Scientific American, vol. 253, no. 6 (Dec. 1985), p. 52.
- The Ariane 5 satellite launcher
malfunction was caused by a faulty software exception routine resulting from a
bad 64-bit floating point to 16-bit integer conversion. Report
Story Story
- During the maiden flight of the
Discovery space shuttle, 30 seconds of (non-critical) real-time telemetry data
was lost due to a problem in the requirement stage of the software development
process. Story
- A train stopped in the middle of nowhere
(London's Docklands Light Railway) due to future station location changes after
the software was deployed and reluctance to change the software. Story
- The Dallas/Fort Worth air-traffic system
began spitting out gibberish in the Fall of 1989 and controllers had to track
planes on paper. "Ghost in the Machine," Time Magazine, Jan. 29, 1990.
p. 58. Story
- Several Space Shuttle missions have been
delayed due to hardware/software interaction problems. Story
- An airplane software control returned
inappropriate responses to pilot inquiries during abnormal flight conditions.
Story
- The Pathfinder reset problem. Story More
- An Iraqi Scud missile hit Dhahran
barracks, leaving 28 dead and 98 wounded. The incoming missile was not
detected by the Patriot defenses, whose clock had drifted 0.36 seconds during
the 4-day continuous siege, the error increasing with elapsed time since the
system was turned on. This software flaw prevented real-time tracking. The
specifications called for tracking at aircraft speeds, not Mach 6 missiles,
and for 14 hours of continuous operation, not 100. Patched software arrived
via air one day later. From ACM SIGSOFT Software Engineering Notes, vol. 16,
no. 3. See Story More More More
- Bug-infested [air traffic control
software] was scoured by software experts at Carnegie-Mellon and the
Massachusetts Institute of Technology to determine whether it could be
salvaged or had to be canceled outright. Story
- Were a missile to approach at a certain
tricky angle (all) 27 programs would fail to shoot it down. Story
- The Apollo 8 spacecraft erased part of
the computer's memory. From G. J. Myers, Software Reliability: Principles
& Practice, p. 25.
- Eighteen errors were detected during the
10-day flight of Apollo 14. From G. J. Myers, Software Reliability:
Principles & Practice, p. 25.
- A 1963 NORAD exercise was incapacitated
because a software error caused the incorrect routing of radar information.
From G. J. Myers, Software Reliability: Principles & Practice, p.
25.
- The U.S. Strategic Air Command's 465L
Command System, even after being operational for 12 years, still averaged one
software failure per day. From G. J. Myers, Software Reliability:
Principles & Practice, p. 25.
- An error in a single FORTRAN statement
resulted in the loss of the first American probe to Venus. From G. J. Myers,
Software Reliability: Principles & Practice, p. 25.
- On June 3, 1980, the North American
Aerospace Defense Command (NORAD) reported that the U.S. was under missile
attack. The report was traced to a faulty computer circuit that generated
incorrect signals. If the developers of the software responsible for
processing these signals had taken into account the possibility that the
circuit could fail, the false alert might not have occurred. From "The
development of software for ballistic-missile defense," by H. Lin,
Scientific American, vol. 253, no. 6 (Dec. 1985), p. 48.
- The manned space capsule Gemini V missed
its landing point by 100 miles because its guidance program ignored the motion
of the earth around the sun. From "The development of software for
ballistic-missile defense," by H. Lin, Scientific American, vol. 253,
no. 6 (Dec. 1985), p. 49.
- Five nuclear reactors were shut down
temporarily because a program testing their resistance to earthquakes used an
arithmetic sum of variables instead of the square root of the sum of the
squares of the variables. From "The development of software for
ballistic-missile defense," by H. Lin, Scientific American, vol. 253,
no. 6 (Dec. 1985), p. 49.
- In a 1977 exercise, when it was
connected to the command-and-control systems of several regional commands, the
WWMCCS had an average success rate for message transmission of only 38
percent. From "The development of software for ballistic-missile defense," by
H. Lin, Scientific American, vol. 253, no. 6 (Dec. 1985), p.
51.
- Aegis was installed on the U.S.S.
Ticonderoga, a Navy cruiser. After the Ticonderoga was commissioned the weapon
system underwent its first operational test. In this test it failed to shoot
down six out of 16 targets because of faulty software; earlier small-scale and
simulation tests had not uncovered certain system errors. In addition, because
of test-range limitations, at no time were more than three targets presented
to the system simultaneously. For a sizable attack approaching Aegis' design
limits the results would most likely have been worse. From "The development of
software for ballistic-missile defense," by H. Lin, Scientific American,
vol. 253, no. 6 (Dec. 1985), p. 51.
- On June 19, 1985 the Strategic Defense
Initiative Organization performed a simple experiment: The crew of the space
shuttle was to position the shuttle so that a mirror mounted on its side could
reflect a laser beamed from the top of a mountain 10,023 feet above sea level.
The experiment failed because the computer program controlling the shuttle's
movements interpreted the information it received on the laser's location as
indicating the elevation in nautical miles instead of feet. As a result the
program positioned the shuttle to receive a beam from a nonexistent mountain
10,023 nautical miles above sea level. From "The development of software for
ballistic-missile defense," by H. Lin, Scientific American, vol. 253,
no. 6 (Dec. 1985), p. 51.
- The first operational launch attempt of
the space shuttle, whose real-time operating software consists of about
500,000 lines of code, failed because of a synchronization problem among its
flight-control computers. The software error responsible for the failure,
which was itself introduced when another error was fixed two years earlier,
would have revealed itself, on the average, once in 67 times. From "The
development of software for ballistic-missile defense," by H. Lin,
Scientific American, vol. 253, no. 6 (Dec. 1985), p. 52.
- "The change was so simple he didn't feel
he had to inform anyone that it took place and the mistake he made was so
stupid. He had no idea of the damage it would cause." The day after the
product shipped 50 beta testers called and reported that all the paychecks
were being printed at zero dollars. Story
- The Sendmail security bug. Story
- Intel processor bugs galore. List Pentium discussion
- A computer-monitored house arrest inmate
escaped and subsequently committed murder. The escape went unreported because
the reporting software did not retry when it received a busy signal at the
main computer number. Story
- The clock in the video camera indicated
a customer had withdrawn his money at the same time as a fraud occurred, so
the bank forwarded his photo to the authorities. The clock had been off by
about one hour. Story
- The nine-hour breakdown of AT&T's
long-distance telephone network in Jan. 1990, caused by an untested code
patch, dramatized the vulnerability of complex computer systems everywhere.
"Ghost in the Machine," Time Magazine, Jan. 29, 1990. p. 58. Story
- On July 1-2, 1991, computer-software
collapses in telephone switching stations disrupted service in Washington DC,
Pittsburgh, Los Angeles and San Francisco. Once again, seemingly minor
maintenance problems had crippled the digital System 7. About twelve million
people were affected in the crash of July 1, 1991. Said the New York Times
Service: "Telephone company executives and federal regulators said they
were not ruling out the possibility of sabotage by computer hackers, but most
seemed to think the problems stemmed from some unknown defect in the software
running the networks." Within the week, a red-faced software company, DSC
Communications Corporation of Plano, Texas, owned up to glitches in the signal
transfer point software that DSC had designed for Bell Atlantic and Pacific
Bell. The immediate cause of the July 1 crash was a single mistyped character:
one tiny typographical flaw in one single line of the software. One mistyped
letter, in one single line, had deprived the nation's capital of phone service.
It was not particularly surprising that this tiny flaw had escaped attention:
a typical System 7 station requires ten million lines of code. From The
Hacker Crackdown, by Bruce Sterling, 1992. Story
More More More
- During a payday rush in 1989, a faulty
program shut down 1,800 automated-teller machines at Tokyo's Dai-Ichi Kangyo
Bank. "Ghost in the Machine," Time Magazine, Jan. 29, 1990. p. 58. Story
- When an airline's reservation system
went down in 1989, 14,000 travel agents had to book flights manually. "Ghost
in the Machine," Time Magazine, Jan. 29, 1990. p. 58. Story
- In the early 1980s, Buick had to give
80,000 V6 cars a chip transplant to fix flaws in their microprocessors. "Ghost
in the Machine," Time Magazine, Jan. 29, 1990. p. 58. Story
- The New York Stock Exchange opened one
hour late on Dec. 18, 1995 due to a communications problem in the software. Story
- Chemical Bank went down for 5 hours on
July 20, 1994 due to a file update overloading the computer system. Story
- There was a San Francisco 911 system
crash of over 30 minutes on Oct. 12, 1995. Patched but not fixed, it still
misses between 100 and 200 calls per day. Story
- The hole in the ozone layer over Antarctica
went undetected for an extended period because the software treated the data
as anomalous: the readings were outside the specified range. Story
- The Denver airport stayed closed for
over a year due to software glitches in the automated baggage handling system.
Story More
- Bell Atlantic Corp. failed to bill
approximately 400,000 AT&T customers in parts of Virginia, Maryland,
Washington D.C., and West Virginia for their long-distance calls on their
January 1998 bill. AT&T stated that their Operations Support Systems
provided Bell Atlantic with the correct billing data for three of the twenty
billing cycles, customers billed on the 2nd, 4-5th, and 7th of the month, and
that a Bell Atlantic computer error failed to produce the AT&T portion of
the bill. Bell Atlantic has stated that the problem was a "systems glitch",
"processing error", and/or "data processing error". [Supposedly, computer
tapes were used to transfer the billing details between AT&T and Bell
Atlantic.] From an AT&T press release, dated 16-Jan-1998, reprinted in the
Richmond Times-Dispatch, 17 Jan 1998, p. C10.
- Oodles of software will fail in the year
2000. Story More
More
Lots more
- The IRS uncovered an unintended side
effect of its effort to eliminate the Year 2000 computer bug: About 1,000
taxpayers who were current in their tax installment agreements were suddenly
declared in default due to a programming error. [There are 62 million lines of
source code to check; the error was caused by an attempted Y2K fix.] From the
Associated Press newswire (AP US & World, 23 Jan 1998, by Rob
Wells).
- An alert to all National Association of
Miniature Enthusiasts (NAME) members: A member recently called the office to
find out why she hasn't received her Houseparty Gazette. She discovered
that the computer has deactivated ALL members whose memberships expire in the
year 2000 and beyond. Kim ... said she had no way of knowing who those folks
are unless they call her and let her know. From the rec.arts.dollhouses
newsgroup.
- One production line shut down when the
laser-driven printer putting "sell-by" dates on products couldn't handle the
2000 date. Industry Week, Jan. 5, 1998, p. 26.
- Many programs err in, or simply ignore,
the century rule for leap years on the Gregorian calendar (every 4th year is a
leap year, except every 100th year which is not, except every 400th year which
is). For example, early releases of the popular spreadsheet program Lotus
1-2-3 treated 2000 as a non-leap year, a problem eventually fixed. But all
releases of Lotus 1-2-3 take 1900 as a leap year; by the time this error was
recognized, the company deemed it too late to correct: ``The decision was made
at some point that a change now would disrupt formulas which were written to
accommodate this anomaly''. Excel, part of Microsoft Office, has the same
flaw. From Calendrical Calculations, N. Dershowitz and E. M. Reingold,
p. xviii.
- The New York City Taxi and Limousine
Commission chose March 1, 1996 as the start date for a new, higher fare
structure for cabs. Meters programmed by one company in Queens forgot about
the leap day and charged customers the higher rate on February 29. The New
York Times, March 1, 1997.
- A computer software error at the Tiwai
Point aluminum smelter in Southland, New Zealand at midnight on New Year's Eve
1996 caused more than $AU 1 million of damage. The software failed to account
for leap years and treated the 366th day of the year as invalid, causing 660
process control computers to shut down and the smelting pots to cool. The same
problem occurred two hours later at Comalco's
Bell Bay smelter in Tasmania (which is two hours behind New Zealand). The
general manager of operations for New Zealand Aluminum Smelters, David Brewer,
said ``It was a complicated problem and it took quite some time [until
midafternoon] to find the cause.'' The New Zealand Herald, January 8,
1997, and The Dominion, in Wellington, New Zealand.
- A "computer error" is blamed for a false
report of three deaths from an incurable disease when a woman killed her
daughter and tried to kill her son and herself. From ACM SIGSOFT Software
Engineering Notes, vol. 10, no. 3.
- A Norwegian class gets a pornographic
image because of a cache problem, when a recycled link leads to a pornographic
site. From Internet Risks Forum NewsGroup (RISKS), vol. 19, issue
47.
- Computers were blamed when, in three
separate incidents, 3 million, 5.4 million, and 1.5 million gallons of raw
sewage were dumped into the Willamette River. From ACM SIGSOFT Software
Engineering Notes, vol. 13, no. 3.
- The U.S. national EFTPOS system crashed
on 2 Jun 1997 for two hours and 100K transactions were "lost". One central
processor failed and backup procedures to redistribute the load also failed.
From Internet Risks Forum NewsGroup (RISKS), vol. 19, issue
21.
- Computer blunders were blamed for $650M
student loan losses. From ACM SIGSOFT Software Engineering Notes, vol.
20, no. 3.
- An Internet routing "black hole" cuts
off ISPs; MAI Network Services routing table errors directed 50,000 routing
addresses to MAI; InterNIC goofed, as well, 23 Apr 1997. From ACM SIGSOFT
Software Engineering Notes, vol. 22, no. 4.
- Votes were lost by a computer in
Toronto. The Toronto district finally abandoned computerized voting, leaving a
year-old race unresolved. From ACM SIGSOFT Software Engineering Notes,
vol. 15, no. 2.
- A cat was registered as a voter to
demonstrate risks (no pawtograph required). From ACM SIGSOFT Software
Engineering Notes, vol. 20, no. 1.
- A "read-ahead" synchronization glitch
and/or an eager operator caused a large data entry error, and the wrong winner
was announced in a Rome, Italy city election. From ACM SIGSOFT Software
Engineering Notes, vol. 15, no. 1.
- In a German parliament election, the
program rounded up the Greens' 4.97%, which was less than the 5% cutoff; when
this was corrected, the Social Democrats attained a one-seat majority. From ACM
SIGSOFT Software Engineering Notes, vol. 17, no. 3.
- An Oregon computer error reversed
election results. From ACM SIGSOFT Software Engineering Notes, vol. 18,
no. 1.
- A (CTSS) raw password file was
distributed as message-of-the-day, due to an editor temporary file name
confusion. See Morris and Thompson, CACM 22, 11, Nov
1979.
- The U.S. Social Security Administration
systems could not handle non-Anglo names, affecting $234 billion for 100,000
people, some going back to 1937. From Internet Risks Forum NewsGroup
(RISKS), vol. 18, issue 80.
- Software prevented the correction of a
recognized Olympic skating scoring error. From ACM SIGSOFT Software
Engineering Notes, vol. 17, no. 2.
- A computer scoring glitch at an Olympic
boxing match causes the evident winner to lose. From ACM SIGSOFT Software
Engineering Notes, vol. 17, no. 4.
- A man's auto insurance rate triples when
he turns 101 (= 1 mod 100). From ACM SIGSOFT Software Engineering
Notes, vol. 12, no. 1.
- A Montreal life insurance company dies
due to software bugs in its integrated system. From ACM SIGSOFT Software
Engineering Notes, vol. 17, no. 2.
- A computer test residue generates a
false tsunami warning in Japan. From ACM SIGSOFT Software Engineering
Notes, vol. 19, no. 3.
- Chicago cat owners were billed $5 for
unlicensed dachshunds. A database search on "DHC" (for dachshunds) found
"domestic house cats" with shots but no license. From ACM SIGSOFT Software
Engineering Notes, vol. 12, no. 3.
- The Korean Airlines KAL 801 accident in
Guam killed 225 out of 254 aboard. A worldwide bug was discovered in
barometric altimetry in Ground Proximity Warning System (GPWS). From ACM
SIGSOFT Software Engineering Notes, vol. 23, no. 1.
- A "computer error" affected hundreds of
U.K. A-level exam results. From Internet Risks Forum NewsGroup (RISKS),
vol. 19, issue 40.
- The Paris police computer mismatched a
Corsican city code with postal code, and was unable to collect motorists'
fines. From Internet Risks Forum NewsGroup (RISKS), vol. 19, issue
41.
- Netscape Communicator 4.02 and 4.01a
allowed disclosure of passwords. From Internet Risks Forum NewsGroup
(RISKS), vol. 19, issue 34.
- A bank robbery "wanted" poster of the
wrong person was due to an unchecked match. From Internet Risks Forum
NewsGroup (RISKS), vol. 19, issue 29.
- The Soviet Phobos I Mars probe was lost,
due to a faulty software update, at a cost of 300 million rubles. Its
disorientation broke the radio link and the solar batteries discharged before
reacquisition. From Aviation Week, 13 Feb 1989.
- An F-18 fighter plane crashed due to a
missing exception condition. From ACM SIGSOFT Software Engineering
Notes, vol. 6, no. 2.
- An F-14 fighter plane was lost to
uncontrollable spin, traced to tactical software. From ACM SIGSOFT Software
Engineering Notes, vol. 9, no. 5.
- A Parisian computer transforms traffic
charges into big crimes. From ACM SIGSOFT Software Engineering Notes,
vol. 14, no. 6.
- CyberSitter censors "menu */ #define"
because of the string "nu...de". From Internet Risks Forum NewsGroup
(RISKS), vol. 19, issue 56.
- In a heavily loaded computer system, a
steady stream of high-priority processes can prevent a low-priority process
from ever getting resources. Generally, one of two things will happen. Either
the process will eventually be run (at 2 A.M. Sunday, when the system is
finally lightly loaded), or the computer system will eventually crash and lose
all unfinished low-priority processes.... Rumor has it that, when they shut
down the IBM 7094 at MIT in 1973, they found a low-priority process that had
been submitted in 1967 and had not yet been run. From Silberschatz and Galvin,
pp. 142-143.
- GTE Corp. mistakenly printed 50,000
unlisted residential phone numbers and addresses in 19 directories that were
leased to telemarketers in communities between Santa Barbara and Huntington
Beach. GTE blames the problem on a software snafu. The company faces fines of
up to 1.5 billion dollars, if found guilty of gross negligence. From
comp.dcom.telecom newsgroup (27 Apr 1998); X Telecom Digest, Volume 18,
Issue 60, Message 4 of 7.
- On Sept. 19, 1989 an overflow (of a
2-byte integer) at a Washington, DC hospital caused a computer to collapse and
forced them to do things manually.
- On Nov. 16, 1989 an overflow (of a
2-byte integer) in the Michigan Terminal System caused a computer crash in
Newcastle, followed by crashes all over the U.S.
- Midwest Telephone Company had a program
for assigning telephone numbers, with a $5 million annual maintenance budget. In
1981, they reported: "No more than 15 known errors remain unsolved at the end
of each month." In fact, people had stopped using the program and were
entering numbers manually, leaving the database hopelessly
outdated.
- Bank of America was forced to write off
a $60 million investment in a new software system and reverted to its
15-year-old predecessor.
- Due to a software error, Continental
Airlines consistently undercharged for plane rentals by one day.
- SRI International's computer reset the
time by averaging 11 clocks, though one was 12 hours off.
- In 1980, the ARPAnet shut down on
account of a self-propagating error.
- Rumor has it that a military plane
flipped over when crossing the equator.
- Rumor has it that an Airbus plane
crashed into its hangar, since its onboard computer interpreted a bump as
turbulence in the air.
- Software reboot during the Apollo 11
landing forced Armstrong to manually land the lunar lander. Story
- In 1989, a Swedish Gripen prototype
crashed due to new software in the fly-by-wire system. Story
- In 1993, a Swedish Gripen fighter plane
crashed during an air show. Story
- Soldiers killed. Story
- Roundup of US government Y2K bugs.
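
Several entries above (the Lotus 1-2-3 and Excel date flaw, the taxi meters that forgot the leap day, the smelter computers that rejected a 366th day) turn on the Gregorian leap-year rule quoted in the list: every 4th year is a leap year, except every 100th year, except every 400th year. A minimal Python sketch of that rule might look like the following. The function names is_leap_year, days_in_year, and is_valid_day_of_year are illustrative assumptions and are not taken from any of the systems described.

```python
def is_leap_year(year: int) -> bool:
    """Gregorian rule: every 4th year, except every 100th, except every 400th."""
    if year % 400 == 0:
        return True   # e.g. 2000: the 400-year exception wins
    if year % 100 == 0:
        return False  # e.g. 1900: a century year that is not a leap year
    return year % 4 == 0


def days_in_year(year: int) -> int:
    """Number of days in the given Gregorian year."""
    return 366 if is_leap_year(year) else 365


def is_valid_day_of_year(year: int, day: int) -> bool:
    """Accept day 366 only in leap years (hypothetical validation helper)."""
    return 1 <= day <= days_in_year(year)


if __name__ == "__main__":
    for y in (1900, 1996, 1997, 2000):
        print(y, is_leap_year(y), days_in_year(y))
```

Python's standard library already provides calendar.isleap for the first check; the hand-written version is shown only to make the three-part rule explicit.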