I'm reading a book on Data Warehousing. This is in preparation for a project that I'm involved in. The project is to build a global reporting solution for IT using SQL Server 2005 Reporting Services. Here are some key words I want to remember about data warehousing:
- Use integers for dimension members to reduce size of the fact table and to gaurantee uniqueness of each member key in a dimension table.
- Use surrogate keys so that the fact table consumes less space. The ability to create multiple instances of the same product–or the same customer–is an extremely important benefit of surrogate keys, and it is particularly important in a data warehouse where you are maintaining historical information for comparison. Surrogate keys are a critical part of most data warehouse design. The foreign key in the fact table and the primary key in the dimension table are then completely under the control of the data warehouse.
- Values in a data warehouse are not changing as dynamically as they would in a production database, so storing the values redundantly is less important than is retrieving the values as quickly as possible for a report.
Understanding OLAP and Analysis Services:
Benefit of an OLAP cube over a relational database:
- Consistently fast response - prestoring calculated values.
- Metadata-based queries
- Spreadsheet-style formulas
- Use Database model OLAP instead of Spreadsheet model OLAP. Advantages and disadvantages, the biggest benefit of OLAP stored using the database model is the ability to avoid data explosion. Because you need relatively few aggregate tables to provide fast results, you can have much larger cubes with many more dimensions and attributes than by using a spreadsheet model. Perhaps the biggest disadvantage of OLAP stored by using a database model is that there is no inherent way to physically store values that are calculated using nonassociative operators.