Sunday, November 04, 2007

Data Warehouse Project Lifecycle by Dylan Wan

Here is the typical lifecycle for data warehouse deployment project:

0. Project Scoping and Planning
Project Triangle - Scope, Time and Resource.
* Determine the scope of the project - what you would like to accomplish? This can be defined by questions to be answered. The number of logical star and number of the OLTP sources
* Time - What is the target date for the system to be available to the users
* Resource - What is our budget? What is the role and profile requirement of the resources needed to make this happen.

1. Requirement
* What are the business questions? How does the answers of these questions can change the business decision or trigger actions.
* What are the role of the users? How often do they use the system? Do they do any interactive reporting or just view the defined reports in guided navigation?
* How do you measure? What are the metrics?

2. Front-End Design
* The front end design needs for both interactive analysis and the designed analytics workflow.
* How does the user interact with the system?
* What are their analysis process?

3. Warehouse Schema Design
* Dimensional modeling - define the dimensions and fact and define the grain of each star schema.
* Define the physical schema - depending on the technology decision. If you use the relational tecknology, design the database tables

4. OLTP to data warehouse mapping
* Logical mapping - table to table and column to column mapping. Also define the transformation rules
* You may need to perform OLTP data profiling. How often the data changes? What are the data distribution?
* ETL Design -include data staging and the detail ETL process flow.

5. Implementation
* Create the warehouse and ETL staging schema
* Develop the ETL programs
* Create the logical to physical mapping in the repository
* Build the end user dashboard and reports

6. Deployment
* Install the Analytics reporting and the ETL tools.
* Specific Setup and Configuration for OLTP, ETL, and data warehouse.
* Sizing of the system and database
* Performance Tuning and Optimization

7. Management and Maintenance of the system
* Ongoing support of the end-users, including security, training, and enhancing the system.
* You need to monitor the growth of the data.

Article from

No comments: