I’ve been reading some excellent material lately about the perilous task of building data warehouse capability. It caused me to reflect on my own learning from running large enterprise data warehouse, business intelligence and advanced analytics projects. So I have thrown together a Top 10 list of tips for new players. I hope you enjoy it:
1. The ETL Process:
- Will be on the critical path of your project.
- Can take 80% of your project time. This can mean you do not have enough time to build applications against the data.
2. Source data:
- You are going to find hidden problems in the systems feeding the data warehouse.
- You will often turn up the need for data not being captured by existing systems.
- Data errors come in 4 ‘types’: Incomplete — Incorrect — Incomprehensible — Inconsistent.
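To make the four error types concrete, here is a minimal sketch of a record check that tags each problem it finds. The field names (`customer_id`, `order_date`, `amount`, `region`) and the rules themselves are hypothetical; real rules depend entirely on your feeder systems.

```python
def classify_errors(record, known_regions):
    """Return the set of error types found in one source record.

    All field names and rules here are illustrative assumptions.
    """
    errors = set()

    # Incomplete: a required field is missing or empty.
    if not record.get("customer_id") or record.get("order_date") is None:
        errors.add("incomplete")

    amount = record.get("amount")
    if amount is not None:
        if not isinstance(amount, (int, float)):
            # Incomprehensible: the value is not in a form we can interpret.
            errors.add("incomprehensible")
        elif amount < 0:
            # Incorrect: the value is readable but cannot be true.
            errors.add("incorrect")

    # Inconsistent: the value disagrees with the agreed reference data.
    region = record.get("region")
    if region is not None and region not in known_regions:
        errors.add("inconsistent")

    return errors


sample = {"customer_id": "", "order_date": None, "amount": -5, "region": "emea"}
print(sorted(classify_errors(sample, {"EMEA", "APAC"})))
```

Counting records per error type and per feeder system is a simple way to show source owners where the hidden problems sit.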
3. Operational support:
- Data warehouses are high-maintenance systems. Reorganisations, product introductions, new pricing schemes, new customers, changes in production systems, etc. are going to affect the warehouse.
- Be prepared to support beginning users immediately and at any time, especially when data is not yet central to business culture.
4. Technical:
- Overhead can eat up great amounts of disk space.
- “There’s all night / weekend to load the database” have been famous last words of many a warehouse developer.
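A quick back-of-the-envelope check can show whether "all night" really is enough. The sketch below is only an estimate under stated assumptions: the throughput figure and the `overhead_factor` (a hypothetical allowance for index rebuilds, retries and the like) are numbers you would have to measure on your own system.

```python
def load_hours(rows, rows_per_second, overhead_factor=1.5):
    """Estimated wall-clock hours for one load.

    overhead_factor is an assumed multiplier for work beyond the raw
    insert rate (index rebuilds, validation, retries).
    """
    return rows * overhead_factor / rows_per_second / 3600


def fits_window(rows, rows_per_second, window_hours, overhead_factor=1.5):
    """True if the estimated load finishes inside the batch window."""
    return load_hours(rows, rows_per_second, overhead_factor) <= window_hours


# 36 million rows at 1,000 rows/s with 50% overhead, against an 8-hour night.
print(load_hours(36_000_000, 1000))
print(fits_window(36_000_000, 1000, window_hours=8))
```

Rerunning the estimate as volumes grow is cheap, and it tends to surface the problem long before the first missed load.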
5. Responsibility:
- From day one, establish that warehousing is a joint user/builder project.
- From day one, establish that maintaining data quality will be an ongoing joint user/builder responsibility.
6. Security:
- End-user data access via your warehouse requires trade-offs with data security. You can’t apply your transaction-processing system mindset.
7. Your business data consumers:
- Will develop conflicting business rules.
- Will understand the “same” word differently.
- Will perform the “same” calculation differently.
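As a small illustration of the "same" calculation diverging, the sketch below uses a hypothetical case where two teams both report "margin" on one sale but divide by different bases. The team names and figures are invented for the example.

```python
def margin_on_price(price, cost):
    """Hypothetical finance definition: profit as a share of the selling price."""
    return (price - cost) / price


def margin_on_cost(price, cost):
    """Hypothetical sales definition: profit as a share of cost (usually called markup)."""
    return (price - cost) / cost


price, cost = 100.0, 80.0
print(margin_on_price(price, cost))  # 0.2
print(margin_on_cost(price, cost))   # 0.25
```

Both numbers are defensible, which is exactly why the warehouse needs one agreed definition per term before the reports go out.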
8. Politics:
- For reasons of politics or overwork, the feeder system programmer can often take a while to give you access to the data.
- Establishing ownership & stewardship of data (quality) will require education and diplomacy.
9. Business Value:
- If you provide a system that is fast and technically elegant but adds little value or has suspect data, you will probably lose your customer.
- The nature of data warehouse developments is often “I’ll know what I want when I see it.”
- Data warehousing best flourishes when done with an entrepreneurial orientation rather than with a reactive orientation. Traditional projects start with requirements and end with data. Data warehousing projects start with data and end with requirements.
- Feeder system owners will fear that implementing a data warehouse will limit the flexibility (or political power) that they have previously enjoyed in a single system environment. Ignore this at your peril.
10. Results (a nice problem to have):
- After end users receive query and reporting tools and see the value of the data, requests for data-scientist-assisted reports may increase rather than decrease, contrary to your ROI forecast.
I hope you find this list useful. Please add any insights you have to the comments below and feel free to get in touch if you want any advice about your specific Data Warehouse project.
Further Reading
The Politics of Data Warehousing by Marc Demarest
Data Warehouse Information Center (excellent resource!!)