news / tech talk
Database Needs
by Lee LeClair04/20/2007
As seen in Inside Tucson Business
- Arizona Daily Star
- BizPlanIt
- Linux World Expo
- 40 under 40
- They're on the A-List
- Dotche system built by Ephibian
- AzBusiness
- Arizona Daily Star
- Arizona Daily Star
- Phoenix Business Journal
- Ranking Arizona
- The Arizona Republic
- Hostingtech.com
- American City Business Journals Inc.
- AZtechBiz
- Inside Tucson Business
- Arizona Business Gazette
- Inside Tucson Business
- Fiesta Mall
- Arizona Daily Star
- .com Success!
- Business Wire
- Buck's Woodside Menu
- CRN
- Arizona Daily Star
- LocalBusiness.com
- The Business Journal - Phoenix
- Phoenix Business Journal
- LocalBusiness.com
- Business Wire
- Inside Tucson Business
- internet.com
- AzBusiness
- AZtechBiz
- designshops.com
- AZtechBiz
- BizAZ
- Virtualized Cloud
- Collaboration and Communication
- Personally Identifiable Information
- Cyberwarfare
- iPad and E-Readers
- Trusted Platform Module
- Smartphone Data Security
- Cyber-Espionage
- DTNs
- Have a Plan
- Cloud Computing - Part 2
- Impact of Technology on Existing...
- Data Archiving
- Mobile Telephony - Part 2
- Cloud Computing
- Social Networks
- Password Management
- Netbooks
- Microtargeting
- Packet Analysis
- IP v6
- Surge Protection
- Traveling Safely
- Thin Client
- Uptime
- Mobile Telephony
- Know Thy Programs
- Voice Over IP - Part 3
- Google Apps
- Virtual Computing
- Securing Users
- Simple Desktop Management
- Service Oriented Architecture
- Light-based Communication
- Data Mining
- Small Business Architecture
- Voice Over IP - Part 2
- Business Automation
- Database Needs
- DMZs
- CPUs
- SPAM & Botnets
- Security Testing
- Customer Advocacy
- Laptop Security
- Windows Vista
- Large Scale Deployment
- Network Access Control
- Generator Use
- Uninterrupted Power Supplies
- Web Site Security
- Blu-ray vs. HD-DVD
- Dual-Core Processors
- Business Security
- AJAX
- 3G Mobile Internet
- Apple Intel Processors
- Entertainment Tech
- Cafe Wireless
- Commercial Hosting
- Gaming Consoles
- Voice Over IP
- Blogging
- Is WI-FI Secure?
- OpenDocument Format
- Allured Publishing Changes Name to...
- Computer Model Can Help Prevent War?
- Defense contractors run gamut from...
- ASU gears on-site construction...
- The Cleveland Foundation Selects...
- Global Partners Join Forces to Speed...
- Intuit Completes Acquisition of...
- Strategy unveiled on how tobacco tax...
- Gaiam's, Real Goods' revenues increase...
- LSST Awarded Time on TeraGrid
- Aldine Independent School District...
- Miraval featured in Natural Solutions...
- Ventana Medical Systems Joins TSIA to...
- UA $3 Million Bioterrorism Grant...
- Arizona Center for Integrative...
Businesses that operate with data of any complexity use databases to store their data. Databases form the heart of information processing for virtually all applications, web-based or otherwise. Given that this is the case, one might think that application programmers are very familiar with the concept of databases and their proper design. That is typically not the case. Part of the reason is that programming and database concepts are taught as separate subjects at most universities and often students concentrate on programming and view databases simply as crude data repositories.
This is unfortunate as is the fact that in many information businesses, there is a clear distinction between “programmers” and “database guys”. Since programming deals with data manipulation and databases store data, these two disciplines are fundamentally linked. Yet without properly combining the two, it is easy to create a data nightmare that is desperately difficult to fix.
Data integrity is paramount. To that end, concentrate on good data design for your purposes. You may have heard terms like On-line Transaction Processing (OLTP), star schemas, data cubes, and data warehouses. These are all different aspects of data that are dependent on what you need. OLTP data is what most people use in their businesses. It maintains data with the greatest emphasis on data integrity so that it can be reliably used to power online applications, shopping carts, inventory systems, financial systems, etc.
Data warehousing is a huge topic but is generally related to storing all your data and than “mining” it for useful nuggets of veins of information (to coin mining terms). Mining is a special form of decision support and basically means that you are manipulating your data store for reports of particular types. Given that your purposes are different for reporting than for transactions, your database design is different to support fast reporting rather than absolutely consistent data transactions. So where an OLTP database ensures that you only store one copy of data to ensure integrity, reporting databases (star or cube designs) typically store multiple copies of data in special sets to support very fast data reads and report generation.
Relational database management systems are complex applications in their own right. They employ the concept of ACID (Atomicity, Consistency, Isolation, and Durability) to maintain integrity and have many features for security roles and permissions. Real databases use Structured Query Language (SQL) to communicate with programs and direct user queries. By “real” databases, I mean databases like Oracle RDBMS, IBM’s DB2, Microsoft’s SQL Server, PostgreSQL, and MySQL. There are many other databases included embedded systems used behind the scenes in accounting and other types of applications. If possible, I recommend using a separate database rather than an embedded one if there will be any possible need to access the data directly. This is because separate databases provide an exposed interface that you can use if you need to customize or integrate the data with other systems.
In developing an application, first work through what the application needs to do and thereby what data needs to be collected and stored. Think ahead to the reports to be gathered; be as detailed as possible about all the types of data to be gathered as well as how the data is inter-related. From this, a good data design will begin with an Entity-Relationship (ER) diagram that captures information about the data (e.g., what type of data, how many characters, etc.) and how the various types of data relate to one another (e.g., one-to-one, one-to-many, and so on). The ER diagram forms the basis of the DB tables and programmers can then begin implementation according to this central roadmap. Use integrity constraints to ensure the data design is solid. Once your application begins collecting data and populating the database, making changes is not trivial. Therefore, it definitely pays to think through the data design up-front as much as possible. Finally, use the database features to ensure its integrity rather than relying on programs to ensure integrity. That way, the database can be accessed independently or integrated with other systems without compromising data integrity.
Lee Le Clair is the CTO at Ephibian. His Tech Talk column appears the third week of each month in Inside Tucson Business
This is unfortunate as is the fact that in many information businesses, there is a clear distinction between “programmers” and “database guys”. Since programming deals with data manipulation and databases store data, these two disciplines are fundamentally linked. Yet without properly combining the two, it is easy to create a data nightmare that is desperately difficult to fix.
Data integrity is paramount. To that end, concentrate on good data design for your purposes. You may have heard terms like On-line Transaction Processing (OLTP), star schemas, data cubes, and data warehouses. These are all different aspects of data that are dependent on what you need. OLTP data is what most people use in their businesses. It maintains data with the greatest emphasis on data integrity so that it can be reliably used to power online applications, shopping carts, inventory systems, financial systems, etc.
Data warehousing is a huge topic but is generally related to storing all your data and than “mining” it for useful nuggets of veins of information (to coin mining terms). Mining is a special form of decision support and basically means that you are manipulating your data store for reports of particular types. Given that your purposes are different for reporting than for transactions, your database design is different to support fast reporting rather than absolutely consistent data transactions. So where an OLTP database ensures that you only store one copy of data to ensure integrity, reporting databases (star or cube designs) typically store multiple copies of data in special sets to support very fast data reads and report generation.
Relational database management systems are complex applications in their own right. They employ the concept of ACID (Atomicity, Consistency, Isolation, and Durability) to maintain integrity and have many features for security roles and permissions. Real databases use Structured Query Language (SQL) to communicate with programs and direct user queries. By “real” databases, I mean databases like Oracle RDBMS, IBM’s DB2, Microsoft’s SQL Server, PostgreSQL, and MySQL. There are many other databases included embedded systems used behind the scenes in accounting and other types of applications. If possible, I recommend using a separate database rather than an embedded one if there will be any possible need to access the data directly. This is because separate databases provide an exposed interface that you can use if you need to customize or integrate the data with other systems.
In developing an application, first work through what the application needs to do and thereby what data needs to be collected and stored. Think ahead to the reports to be gathered; be as detailed as possible about all the types of data to be gathered as well as how the data is inter-related. From this, a good data design will begin with an Entity-Relationship (ER) diagram that captures information about the data (e.g., what type of data, how many characters, etc.) and how the various types of data relate to one another (e.g., one-to-one, one-to-many, and so on). The ER diagram forms the basis of the DB tables and programmers can then begin implementation according to this central roadmap. Use integrity constraints to ensure the data design is solid. Once your application begins collecting data and populating the database, making changes is not trivial. Therefore, it definitely pays to think through the data design up-front as much as possible. Finally, use the database features to ensure its integrity rather than relying on programs to ensure integrity. That way, the database can be accessed independently or integrated with other systems without compromising data integrity.
Lee Le Clair is the CTO at Ephibian. His Tech Talk column appears the third week of each month in Inside Tucson Business