A blog about SQL Server, SSIS, C# and whatever else I happen to be dealing with in my professional life.

Find ramblings

Friday, July 9, 2010

I don't know datawarehouse modeling

I am finally getting serious about trying to understand BI, Analysis services, etc so I've tried to transform our existing data into something that can be easily consumed by SSAS. In general, reading a technical book before I have a vague foundation for the material is a strategy for failure so I've been trying to cobble together a enough of a star schema with "real" data for my books to make sense

Problem

Our sales areas are defined by Regions. Regions have Divisions and Divisions may have Districts. If they don't have districts, in the data at least, they will still have a fake district, 0. Analysis of the data is typically done by areas which says Dimension. Some sample data is below

region  division  district  rdd
02      03        00        0203-00
02      05        00        0205-00
02      05        01        0205-01
02      05        11        0205-11

While I'm sure it's a "it depends" question, would I want to model this as multiple tables or can I make it work with a single dimension? If I make it multiple tables, does this look like I'm going down the right path?

Any comments, pointers, help, etc are greatly appreciated.

1 comment:

Tom Huguelet said...

Although people say "never snowflake" or "always snowflake" I don't think it makes much difference. I tend to model my relational schemas to make ETL easiest, then merge to a star schema via SQL views or the DSV for SSAS consumption. As long as you can have an attribute value at every level in this hierarchy, the way you've modeled it here will work.

If you haven't already, check out Kimball's Data Warehouse Toolit 2nd edition. It's the magna carta of dimensional design.

HTH, @tomhug