Staging metadata framework for the unknown
That's a terrible title but it's the best I got. A client would like to report out of ServiceNow some metrics not readily available in the PowerBI App. The first time I connected, I got a quick look at the Incidents and some of the data we'd be interested in but I have no idea how that data changes over time. When you first open a ticket, maybe it doesn't have a resolved date or a caused by field populated. And since this is all web service stuff and you can customize it, I knew I was looking at lots of iterations to try and keep up with all the data coming back from the service. How can I handle this and keep sane? Those were my two goals. I thought it'd be fun to share how I solved the problem using features in SQL Server 2016.
To begin, I created a database called RTMA to perform my real time metrics analysis.
CREATE DATABASE RTMA;
With that done, I created a schema within my database like
USE RTMA;
GO
CREATE SCHEMA ServiceNow AUTHORIZATION dbo;
To begin, we need a table to hold our discovery metadata.
CREATE TABLE ServiceNow.ColumnSizing ( EntityName varchar(30) NOT NULL , CollectionName varchar(30) NOT NULL , ColumnName varchar(30) NOT NULL , ColumnLength int NOT NULL , InsertDate datetime NOT NULL CONSTRAINT DF_ServiceNow_ColumnSizing_InsertDate DEFAULT (GETDATE()) ); CREATE CLUSTERED COLUMNSTORE INDEX CCI_ServiceNow_ColumnSizing ON ServiceNow.ColumnSizing;The idea for this metadata table is that we'll just keep adding more information in for the entities we survey. All that matters is the largest length for a given combination of Entity, Collection, and Column.
In the following demo, we'll add 2 rows into our table. The first batch will be our initial sizing and then "something" happens and we discover the size has increased.
INSERT INTO ServiceNow.ColumnSizing ( EntityName , CollectionName , ColumnName , ColumnLength , InsertDate ) VALUES ('DoesNotExist', 'records', 'ABC', 10, current_timestamp) , ('DoesNotExist', 'records', 'BCD', 30, current_timestamp);
Create a base table for our DoesNotExist. What columns will be available? I know I'll want my InsertDate and that's the only thing I'll guarantee to begin. And that's ok because we're going to get clever.
DECLARE @entity nvarchar(30) = N'DoesNotExist' , @Template nvarchar(max) = N'DROP TABLE IF EXISTS ServiceNow.Stage; CREATE TABLE ServiceNow.Stage ( InsertDate datetime CONSTRAINT DF_ServiceNow_Stage _InsertDate DEFAULT (GETDATE()) ); CREATE CLUSTERED COLUMNSTORE INDEX CCI_ServiceNow_Stage ON ServiceNow.Stage ;' , @Columns nvarchar(max) = N''; DECLARE @Query nvarchar(max) = REPLACE(REPLACE(@Template, ' ', @Entity), ' ', @Columns); EXECUTE sys.sp_executesql @Query, N'';
We now have a table with one column so let's look at using our synthetic metadata (ColumnSizing) to augment it. The important thing to understand in the next block of code is that we'll use FOR XML PATH('') to concatenate rows together and the CONCAT function to concatenate values together.
See more here for the XML PATH "trick"
If we're going to define columns for a table, it follows that we need to know what table needs what columns and what size those columns should be. So, let the following block be that definition.
DECLARE @Entity varchar(30) = 'DoesNotExist'; SELECT CS.EntityName , CS.CollectionName , CS.ColumnName , MAX(CS.ColumnLength) AS ColumnLength FROM ServiceNow.ColumnSizing AS CS WHERE CS.ColumnLength > 0 AND CS.ColumnLength = ( SELECT MAX(CSI.ColumnLength) AS ColumnLength FROM ServiceNow.ColumnSizing AS CSI WHERE CSI.EntityName = CS.EntityName AND CSI.ColumnName = CS.ColumnName ) AND CS.EntityName = @Entity GROUP BY CS.EntityName , CS.CollectionName , CS.ColumnName;
We run the above query and that looks like what we want so into the FOR XML machine it goes.
DECLARE @Entity varchar(30) = 'DoesNotExist' , @ColumnSizeDeclaration varchar(max); ;WITH BASE_DATA AS ( -- Define the base data we'll use to drive creation SELECT CS.EntityName , CS.CollectionName , CS.ColumnName , MAX(CS.ColumnLength) AS ColumnLength FROM ServiceNow.ColumnSizing AS CS WHERE CS.ColumnLength > 0 AND CS.ColumnLength = ( SELECT MAX(CSI.ColumnLength) AS ColumnLength FROM ServiceNow.ColumnSizing AS CSI WHERE CSI.EntityName = CS.EntityName AND CSI.ColumnName = CS.ColumnName ) AND CS.EntityName = @Entity GROUP BY CS.EntityName , CS.CollectionName , CS.ColumnName ) SELECT DISTINCT BD.EntityName , ( SELECT CONCAT ( '' , BDI.ColumnName , ' varchar(' , BDI.ColumnLength , '),' ) FROM BASE_DATA AS BDI WHERE BDI.EntityName = BD.EntityName AND BDI.CollectionName = BD.CollectionName FOR XML PATH('') ) AS ColumnSizeDeclaration FROM BASE_DATA AS BD;
That looks like a lot, but it's not. Run it and you'll see we get one row with two elements: "DoesNotExist" and "ABC varchar(10),BCD varchar(30)," That trailing comma is going to be a problem, that's generally why you see people either a leading delimiter and use STUFF to remove it or in the case of a trailing delimiter LEFT with LEN -1 does the trick.
But we're clever and don't need such tricks. If you look at the declaration for @Template, we assume there will *always* be at final column of InsertDate which didn't have a comma preceding it. Always define the rules to favor yourself. ;)
Instead of the static table declaration we used, let's marry our common table expression, CTE, with the table template.
DECLARE @entity nvarchar(30) = N'DoesNotExist' , @Template nvarchar(max) = N'DROP TABLE IF EXISTS ServiceNow.Stage; CREATE TABLE ServiceNow.Stage ( InsertDate datetime CONSTRAINT DF_ServiceNow_Stage _InsertDate DEFAULT (GETDATE()) ); CREATE CLUSTERED COLUMNSTORE INDEX CCI_ServiceNow_Stage ON ServiceNow.Stage ;' , @Columns nvarchar(max) = N''; -- CTE logic patched in here ;WITH BASE_DATA AS ( -- Define the base data we'll use to drive creation SELECT CS.EntityName , CS.CollectionName , CS.ColumnName , MAX(CS.ColumnLength) AS ColumnLength FROM ServiceNow.ColumnSizing AS CS WHERE CS.ColumnLength > 0 AND CS.ColumnLength = ( SELECT MAX(CSI.ColumnLength) AS ColumnLength FROM ServiceNow.ColumnSizing AS CSI WHERE CSI.EntityName = CS.EntityName AND CSI.ColumnName = CS.ColumnName ) AND CS.EntityName = @Entity GROUP BY CS.EntityName , CS.CollectionName , CS.ColumnName ) SELECT DISTINCT @Columns = ( SELECT CONCAT ( '' , BDI.ColumnName , ' varchar(' , BDI.ColumnLength , '),' ) FROM BASE_DATA AS BDI WHERE BDI.EntityName = BD.EntityName AND BDI.CollectionName = BD.CollectionName FOR XML PATH('') ) FROM BASE_DATA AS BD; DECLARE @Query nvarchar(max) = REPLACE(REPLACE(@Template, ' ', @Entity), ' ', @Columns); EXECUTE sys.sp_executesql @Query, N'';
Bam, look at it now. We took advantage of the new DROP IF EXISTS (DIE) syntax to drop our table and we've redeclared it, nice as can be. Don't take my word for it though, ask the system tables what they see.
SELECT S.name AS SchemaName , T.name AS TableName , C.name AS ColumnName , T2.name AS DataTypeName , C.max_length FROM sys.schemas AS S INNER JOIN sys.tables AS T ON T.schema_id = S.schema_id INNER JOIN sys.columns AS C ON C.object_id = T.object_id INNER JOIN sys.types AS T2 ON T2.user_type_id = C.user_type_id WHERE S.name = 'ServiceNow' AND T.name = 'StageDoesNotExist' ORDER BY S.name , T.name , C.column_id;Excellent, we now turn on the actual data storage process and voila, we get a value stored into our table. Simulate it with the following.
INSERT INTO ServiceNow.StageDoesNotExist (ABC, BCD) VALUES ('Important', 'Very, very important');Truly, all is well and good.
*time passes*
Then, this happens
WAITFOR DELAY ('00:00:03'); INSERT INTO ServiceNow.ColumnSizing ( EntityName , CollectionName , ColumnName , ColumnLength , InsertDate ) VALUES ('DoesNotExist', 'records', 'BCD', 34, current_timestamp);Followed by
INSERT INTO ServiceNow.StageDoesNotExist (ABC, BCD) VALUES ('Important','Very important, yet ephemeral data');To quote Dr. Beckett: Oh boy
No comments:
Post a Comment