Nearly five years ago Computers of South Dublin was approached about converting a complicated financial spreadsheet into a web-based application that would reproduce the data entry forms and formulae of the spreadsheet, allowing each logged-in user to modify only data that they “owned.” I responded with a proposal to implement this bespoke web design using the PHP programming language for the visual interface and mathematical calculations, with the “cell” data stored as tables in the MySQL database.
The project investors, being cautious, were interested in a layman's explanation of why a database implementation (and why MySQL in particular) would be desirable when it seemed like their current implementation with Excel should have been sufficient. The answer to this question became an explanation of why distributed databases are necessary, particularly as the single-source data model fails to grow across a network.
My response was presented in a simple COSD white paper in 2008, and is presented unedited here. In redesigning this web site in 2012, I am reluctant to replace it with an updated article, since:
- the material below is still accurate today, when accounting for the chaotic behaviour of businesses
- the advent of cloud services has introduced the Amazon RDS service which is based primarily and originally on MySQL
- this single page was the most linked-to page on the old site… in the last four years I've had numerous unsolicited requests for similar "layman's explanations" and I look forward to finally indulging them.
Robert Phair, Director
Computers of South Dublin
06 December 2007
Why is using a database better than Excel?
Excel was developed to allow a single computer user to maintain data in one file at a time on one computer. It has no built-in ability to share its data with other computers or programs. In order for Excel data to be modified, a computer operator would have to be continuously present at the computer where the spreadsheet file was accessible. Using a database for storage eliminates this problem because data in the database can be modified automatically (by programs) as well as manually.
Ensuring that the Excel file is accessible across a computer network, in order to eliminate the dependence upon a single computer, does not provide a complete solution. For instance, any time a spreadsheet file is opened over the network, the file appears “locked“ with respect to all other computers to avoid possible conflicts or data loss due to concurrent changes. A database implementation, since all pieces of data can be modified independently, does not have this limitation.
Database systems containing the acronym SQL are designed for accessibility over a network, so a database set up on an Internet-accessible machine could be accessed and modified by any machine on the Internet with the correct access information (i.e., the server password). Systems exist to provide interaction between Excel files and network-based applications, but they are expensive, inherently unreliable, and notoriously difficult to implement… as opposed to a system like MySQL, which is only a single software component containing all database and network features.
Finally, databases have capabilities that do not exist in Excel: for instance, they provide rapid search capabilities through large amounts of data, returning records that meet potentially any criteria, and can create new databases (such as summaries) from the contents of existing databases, simplifiying the generation of human-readable queries and reports.
Why is MySQL the best database software to use?
The two most popular choices for web server SQL implementations are Microsoft SQL Server and MySQL. Microsoft SQL Server is only supported and improved upon by a single company, while MySQL was developed by a consortium of software developers and is guided in its ongoing development by both industry requirements and compliance with open standards.
This results in MySQL being an overwhelming favourite for application developers, with the following benefits:
- MySQL is provided free on all UNIX web servers and is intelligently integrated with other tools, such as Perl and PHP (also free), that can represent its data in a web-based environment.
- MySQL therefore involves no overhead in installation time, and very little ongoing maintenance time relative to Microsoft SQL Server.
- MySQL is simple to program, having both the best online documentation of any available database system and also discussion forums where developers can obtain quick solutions to problems.
- MySQL is the most reliable database, both in the author's experience and as evidenced in its preferred use by Google, Yahoo, NASA, and much of the worldwide financial community.
- MySQL is available free of charge (only the Enterprise version requires the payment of licensing fees).
- Last Updated: 04 October 2012