Stop the Data Warehouse Creep
Posted by J Singh | Filed under ETL process, Hadoop, Big Data, Data Warehouse
Have you experienced this?
- Your data warehouse takes about 8 hours to load the cube. The cube is updated weekly; 8 hours per week is a small price to pay and everyone is happy.
- It is missing a few data elements that another group of analysts really needs. You add those and now the cube takes 9 hours to load. No big deal.
- Time passes, the business is doing well, the size of the data quadruples. It now takes 36 hours per week to load the cube.
- Time passes, some of the data elements are not needed any more, but it is too hard to take them out of the process, so it continues to take 36 hours.
- You add yet another group of analysts as users and a few more data elements. It now takes 44 hours per week!
- You get the picture… the situation gets more and more precarious over time.
We hope the dam holds!
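To make the creep concrete, here is a rough sketch of the arithmetic in the list above. The hour figures come straight from the list; the function names are made up for illustration, and the assumption that load time scales linearly with data volume is ours (it matches the jump from 9 to 36 hours, but a real warehouse may behave differently).

```python
HOURS_PER_WEEK = 7 * 24  # 168 hours available before the next weekly load is due

# (what changed, load hours afterwards), using the figures from the list above
timeline = [
    ("initial weekly load", 8),
    ("data elements added for a second analyst group", 9),
    ("data volume quadruples", 9 * 4),
    ("unused elements left in the process", 36),
    ("third analyst group and more elements", 44),
]


def share_of_week(load_hours):
    """Fraction of the week the cube load occupies."""
    return load_hours / HOURS_PER_WEEK


def quadruplings_until_overflow(load_hours):
    """Count further 4x growths in data until the load no longer fits
    in one week, ASSUMING load time scales linearly with data volume."""
    count = 0
    while load_hours <= HOURS_PER_WEEK:
        load_hours *= 4
        count += 1
    return count


for event, hours in timeline:
    print(f"{event:<48} {hours:>3} h  {share_of_week(hours):6.1%} of the week")

print("quadruplings left before a weekly load overruns the week:",
      quadruplings_until_overflow(44))
```

Under that linear assumption, a 44-hour load already uses about a quarter of the week, and one more quadrupling of the data would push it past 168 hours, at which point a weekly refresh is no longer possible.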
Here's an idea from Read Write Web:
One example … was a credit card company working to implement fraud detection functionality. A traditional SQL data warehouse is more than likely already in place, and it may work well enough but without enough granularity for ...