A blog about SQL Server, SSIS, C# and whatever else I happen to be dealing with in my professional life.

Find ramblings

Loading...

Wednesday, September 23, 2009

Powershell just replaced cmd

I typically install cygwin when I get a new machine so I can have
access to unix tools. We actually have unix boxen here so I usually
ssh over to one if I need to do some utility work. Sadly, one of the
things I miss is the unix command which
http://en.wikipedia.org/wiki/Which_(Unix) It simply shows the path to
an executable that is in your path. Far better than trying to cd \;
dir /s csc.exe and then comparing those results to my $PATH to figure
out which C# compiler I'm invoking. Not that I'd so something like
that, mind you.

In powershell, it's built in! How awesome is that?

PS C:\>Get-Command csc

CommandType Name Definition
----------- ---- ----------
Application csc.exe
c:\WINDOWS\Microsoft.NET\Framework\v2.0.50727\csc.exe

Tuesday, September 22, 2009

Basics of joins

On a DDD, Date Driven Development, project, our boss has identified
that we need to make use of a junior developer for the database work.
By "make use," she means train to use SSIS but I'm pretty positive
they've never written a query before. This post and upcoming posts
will hopefully cover the information I'm trying to educate the Junior
on. When I started working with ANSI joins, I made a diagram to help
keep me straight.

The following TSQL will create concrete examples for them to work
against vs hypothetical sets


CREATE TABLE dbo.ODDS
(
col1 int NOT NULL PRIMARY KEY
)
GO

CREATE TABLE dbo.EVENS
(
col1 int NOT NULL PRIMARY KEY
)
GO
INSERT INTO
dbo.ODDS
-- Yes, I know zero isn't odd but it's illustrative
SELECT
0 AS col1
UNION ALL
SELECT
1
UNION ALL
SELECT
3
UNION ALL
SELECT
5
UNION ALL
SELECT
7
UNION ALL
SELECT
9
GO
INSERT INTO
dbo.EVENS
SELECT
0 AS col1
UNION ALL
SELECT
2
UNION ALL
SELECT
4
UNION ALL
SELECT
6
UNION ALL
SELECT
8
UNION ALL
SELECT
10
GO

SET NOCOUNT ON

Green:
SELECT
*
FROM
dbo.ODDS A
INNER JOIN
dbo.EVENS B
ON B.col1 = A.col1

Yellow:
SELECT
A.*
FROM
dbo.ODDS A
LEFT OUTER JOIN
dbo.EVENS B
ON B.col1 = A.col1
WHERE
B.col1 IS NULL

Blue:
SELECT
B.*
FROM
dbo.ODDS A
RIGHT OUTER JOIN
dbo.EVENS B
ON B.col1 = A.col1
WHERE
A.col1 IS NULL


Colorblind:
SELECT
*
FROM
dbo.ODDS A
FULL OUTER JOIN
dbo.EVENS B
ON A.col1 = B.col1

GO
DROP TABLE dbo.ODDS
GO
DROP TABLE dbo.EVENS
GO

TODO: Set code with colors

Monday, September 21, 2009

Database Snapshots

This post is a work in progress. I will edit once I'm home and not
sending these in via email

In short, I think using database snapshots coupled with unit testing
could help alleviate my challenge with database testing in that it
changes the state of the sucker as you're testing. Transactions are
limited in that the changes only exist within the context of the
current thread. Snapshots seem like they'll allow for a persisted
transaction. Once everything is validated, I can roll it back and
have not altered the state of the database. That's my theory at
least. TODO: see whether identities remain incremented ala
transaction or if it really gets rolled back.

I love TSQL but my coworkers don't share my zeal so I want to simplify
the creation of a database snapshot. I use the database created in an
earlier post, SCHEMAD in this example. If I print out the sql
statement I build up, it looks right. If I copy the output and run
it, it works fine. However, if I have the script execute it, it blows
chunks with the error about not locating it in sysdatabases

-- http://msdn.microsoft.com/en-us/library/ms175876.aspx
DECLARE @sql nvarchar(max)
SELECT
@sql = 'CREATE DATABASE ' + DB.name + '_snapshot ON ' + char(10)
+ '( NAME = ' + MF.name + char(10) + ', FILENAME = ''' +
replace(MF.physical_name, '.mdf', '.snapshot')
+ '''' + char(10)+ ')' + char(10) + ' AS SNAPSHOT OF ' + DB.name
FROM
sys.master_files MF
INNER JOIN
sys.databases DB
ON DB.database_id = MF.database_id
WHERE
MF.data_space_id = 1
AND MF.type_desc = 'ROWS'
-- create filter here to identify what database you want a snapshot of
AND DB.name = 'SCHEMAD'

PRINT @sql

EXECUTE @sql


/*
CREATE DATABASE SCHEMAD_snapshot ON
( NAME = SCHEMAD
, FILENAME = 'C:\Program Files\Microsoft SQL
Server\MSSQL.2\MSSQL\DATA\SCHEMAD.snapshot'
)
AS SNAPSHOT OF SCHEMAD
Msg 911, Level 16, State 1, Line 19
Could not locate entry in sysdatabases for database 'CREATE DATABASE
SCHEMAD_snapshot ON
( NAME = SCHEMAD
, FILENAME = 'C:\Program Files\Microsoft SQL Server\MSSQL'. No entry
found with that name. Make sure that the name is entered correctly.
*/

Dude, where's my database?

Ultra-brief post today.

If you need to know what physical file(s) are associated with what
databases, then sys.master_files is the view for you.
http://msdn.microsoft.com/en-us/library/ms186782.aspx Plenty of handy
information there but physical_name is where you can determine what
mdf file(s) a database is using and where they are located on the file
system.

Wednesday, September 16, 2009

Database schema

Brief post as I'm tired this evening. In an upcoming project, the users have requested that we stage their edits. Lead dev and myself have had similar experiences where users request to stage their data, wonderfully complex workflow engines are built and after a short period in production, they ask for the big red button to mass approve all changes.


The purpose of tonight's post is to address one of the concepts they are kicking around. Instead of having staged and live data in the same tables with an indicator (flag, publish date, whatever) they were considering having two separate databases with identical schemas to support this. They would create different user accounts for accessing the data to make sure only blessed content is served to the public. Without knowing the details, I can see why separate databases would be considered but I also mentioned that it might be an opportunity to use schemas to logically partition the database. They could still create the different accounts for access but for database backups and restores, especially as developers roll on and off the project, I thought there might be some merit to the schema approach.


The following code will create a sample database. Within that database, it will create a live and stage schema and the same table within each (works). I will load sample data into the stage version and use a stored procedure to simulate the approval of content for the productional site. TSQL is unpolished but it gets the point across.

What's not shown is the creation of user accounts and assignment of accounts/roles to schemas. It's late and unfortunately for me, I am not fluent enough in the security aspects of SQL Server to generate the TSQL without digging in to BOL.


CREATE DATABASE SCHEMAD
go
USE schemad
go
create schema stage
GO
create schema live
GO

CREATE TABLE stage.WORKS
(
works_id int identity(1, 1) NOT NULL PRIMARY KEY
, message_text varchar(max) NOT NULL
, name_last varchar(50) NULL
)
GO
CREATE TABLE live.WORKS
(
works_id int identity(1, 1) NOT NULL PRIMARY KEY
, message_text varchar(max) NOT NULL
, name_last varchar(50) NULL
)
GO
CREATE PROCEDURE
stage.Publish
(
@works_id int
)
AS
BEGIN
SET NOCOUNT ON

INSERT INTO
live.WORKS
(
message_text
, name_last
)
SELECT
W.message_text
, name_last
FROM
stage.WORKS W
WHERE
W.works_id = @works_id

END
GO
-- load up some staging data
INSERT INTO
stage.WORKS
(
message_text
, name_last
)
OUTPUT INSERTED.*
SELECT
'My fingers emit sparks of fire with expectations of my future labors' AS message_text
, 'blake' AS name_last
UNION ALL
SELECT
'The mind can make a heaven out of hell and a hell out of heaven' AS message_text
, 'milton' AS name_last
UNION ALL
SELECT
'If the fool would persist in his folly he would become wise' AS message_text
, 'blake' AS name_last
UNION ALL
SELECT
'Excess of sorrow laughs. Excess of joy weeps' AS message_text
, 'blake' AS name_last

GO
-- publish some blake
EXECUTE stage.publish 1
EXECUTE stage.publish 3

GO
SELECT * FROM stage.works
SELECT * FROM live.works

GO
/*
-- cleanup
drop table live.works
drop table stage.works
drop procedure stage.publish
drop schema stage
drop schema live


*/

Tuesday, September 15, 2009

Twitter friend or foe

I'm still trying to "get" Twitter. MySpace I never got. FaceBook, I get. One thing I notice about Twitter is the spammers. I had been diligent in blocking them but today I got to wondering if I could programatically determine whether these followers were spammers. In the beginning, all of few months ago, it was easier. If someone started following me with a follower to friend ratio that was just ridiculous, they were probably a spammer. Now I'm seeing more clever approaches. Spam creators are making lots of accounts and linking between them to bring the friend to follow ratio to a more natural balance.

What this code will hopefully do is attempt to determine if a follower is a spammer or not. Beyond simple friend:follower:tweet calculations, I think looking at bio information and actual tweets, looking for overlaps between links in both places would be a good start. Additionally, repeated text in tweets (excluding standard noise like RT, etc) might be a good indicator of a spammer. For now, it just computes the aforementioned ratios between your followers and prints it out.

If nothing else, this code was fun because I've now actually used LINQ, LINQ to XML to be precise. Expert, not in the slightest but it at least seemed a much more pleasant way to iterate through an XML doc. I still have a lot of reading to do on it as this is based on 5 min tutorial I found. Code below was written in VS 2010 beta on Windows 7 RC1 but should work on VS 2008. Obviously, you need to set your userid and password to get your follower info. Link to the Yedda Twitter wrapper



namespace FriendOrFoe
{
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Xml.Linq;

using Yedda;

/// <summary>
/// This is a rough attempt to see if I can identify spammers that follow me on
/// Twitter @billinkc.
/// </summary>
public class Program
{
/// <summary>
/// As it's a first pass, I'm just doing everything
/// in Main until I make sure I understand how to work with
/// twitter service calls as well as look for patterns between
/// spammers and non-spammers
/// </summary>
/// <param name="args">list of cammand-line arguments</param>
static void Main(string[] args)
{
Yedda.Twitter t = null;
t = new Twitter();
string username = "yourAcccountHere";
string password = "password";
string results = string.Empty;
results = t.GetFollowers(username, password, Twitter.OutputFormatType.XML);
XDocument doc = null;
doc = XDocument.Parse(results);
var xmlSource = XDocument.Parse(results);

var query = from p in xmlSource.Elements("users").Elements("user")
select p;

Twit twit;
Console.WriteLine("User\tFriend:Follower\tFriend:Tweet\tFollower:Tweet");
foreach (var record in query)
{
twit = new Twit(record.Element("name").Value, Convert.ToInt32(record.Element("id").Value), Convert.ToInt32(record.Element("friends_count").Value), Convert.ToInt32(record.Element("followers_count").Value), Convert.ToInt32(record.Element("statuses_count").Value));
Console.WriteLine(twit);
}
Console.Read();
}
}

/// <summary>
/// A person on twitter and the metrics surrounding them
/// </summary>
struct Twit
{
/// <summary>
/// A users name
/// </summary>
public string userName;

/// <summary>
/// The twitter id for an account
/// </summary>
public int id;

/// <summary>
/// The number of friends a user has
/// </summary>
public int friendCount;

/// <summary>
/// The count of followers for a user
/// </summary>
public int followerCount;

/// <summary>
/// The count of status updates. This should be a bigint BTW
/// </summary>
public int tweetCount;

/// <summary>
/// Non-default constructor, does nothing special
/// </summary>
/// <param name="userName">User name for a twitter account</param>
/// <param name="id">id representing the account</param>
/// <param name="friendCount">count of friends</param>
/// <param name="followerCount">count of followers</param>
/// <param name="tweetCount">count of posts/tweets</param>
public Twit(string userName, int id, int friendCount, int followerCount, int tweetCount)
{
this.userName = userName;
this.id = id;
this.friendCount = friendCount;
this.followerCount = followerCount;
this.tweetCount = tweetCount;
}

/// <summary>
/// Show the user and guess whether they are a spammer.
/// TODO: play with numbers. Get more info to look at bio & content of tweets
/// esp. for repeated crap
/// </summary>
/// <returns></returns>
public override string ToString()
{
double friendToFollower = 0d;
double friendToTweet = 0d;
double followerToTweet = 0d;
friendToFollower = this.friendCount / ((this.followerCount == 0) ? 1 : this.followerCount * 1.0);
friendToTweet = this.friendCount / ((this.tweetCount == 0) ? 1 : this.tweetCount * 1.0);
followerToTweet = this.followerCount / ((this.tweetCount == 0) ? 1 : this.tweetCount * 1.0);
//return string.Format("User: {0} Friend:Follower {1} friend:tweet {2} follower:tweet {3}", this.userName, friendToFollower, friendToTweet, followerToTweet);
return string.Format("User: {0} {1:F} {2:F} {3:F}", this.userName, friendToFollower, friendToTweet, followerToTweet);
}

}
}


Oh and as analysis of my own account, it looks like if the Friend:Tweet is higher than 3, it might be a good candidate for blocking. Next step will be to set up a honeypot account and start posting in trending topics.

Monday, September 14, 2009

Visual studio build macros

We defined Post-build events for one of our SSIS components .NET projects. Nothing terribly fancy, it just registers the signed assembly in the GAC and pushes it to the Tasks and Pipleine folders for use in the toolbox. As part of our migration from VS 2005 to 2008, I had to rework the events as they were hard coded for 05.

If one right clicks on a project, you can define your pre and post build events. The links below further link to more information if that doesn't give you enough to go on. My challenge was given all the possible macros, which ones did I want? Was it DevEnvDir, FrameworkDir, or FrameworkSDKDir. In the event builder dialogue, there is a Macros >> which results in the following but it only lists what is defined

Sample values for Macros for Build Commands and Properties Your
mileage may vary

http://msdn.microsoft.com/en-us/library/c02as0cs(VS.71).aspx
http://msdn.microsoft.com/en-us/library/c02as0cs.aspx

This was our original post-build scneario. It would force the custom component dll to be registered with the GAC. That DLL contains both data flow task components as well as control flow items so for developers to use it, it needs to exist in the PipelineComponents folder as well as the Tasks folder. We copy it to the Framework folder so it can be picked up for Script tasks and finally, we copy push a template package into the VS folder so it shows as template. As part of the migration from SQL Server 2005 to 2008, the only real change we had to make from a post-build perspective was to make the 90 folders 100. However, the virtual I was working on didn't have gacutil in the path. It seemed silly to update the path to navigate to the executable and so I started digging through the available macros



gacutil -iF "$(TargetPath)"
copy /y "$(TargetPath)" "$(ProgramFiles)\Microsoft SQL Server\90\DTS\PipelineComponents"
copy /y "$(TargetPath)" "$(ProgramFiles)\Microsoft SQL Server\90\DTS\Tasks"
copy /y "$(TargetPath)" "C:\WINDOWS\Microsoft.NET\Framework\v2.0.50727"
copy /y "$(SolutionDir)\SQL\SSIS\PackageTemplate.dtsx" "$(DevEnvDir)\PrivateAssemblies\ProjectItems\DataTransformationProject\DataTransformationItems"

becomes

"$(FrameworkSDKDir)bin\gacutil.exe" -u "$(TargetName)"
"$(FrameworkSDKDir)bin\gacutil.exe" -i "$(TargetFileName)"
copy /y "$(TargetPath)" "$(ProgramFiles)\Microsoft SQL Server\100\DTS\PipelineComponents"
copy /y "$(TargetPath)" "$(ProgramFiles)\Microsoft SQL Server\100\DTS\Tasks"
copy /y "$(TargetPath)" "C:\WINDOWS\Microsoft.NET\Framework\v2.0.50727"
copy /y "$(SolutionDir)\SQL\SSIS\PackageTemplate.dtsx" "$(DevEnvDir)\PrivateAssemblies\ProjectItems\DataTransformationProject\DataTransformationItems"


Using this as my Post-Event scenario

echo RemoteMachine = $(RemoteMachine)
echo References = $(References)
echo ConfigurationName = $(ConfigurationName)
echo PlatformName = $(PlatformName)
echo Inherit = $(Inherit)
echo NoInherit = $(NoInherit)
echo StopEvaluating = $(StopEvaluating)
echo ParentName = $(ParentName)
echo RootNameSpace = $(RootNameSpace)
echo IntDir = $(IntDir)
echo OutDir = $(OutDir)
echo DevEnvDir = $(DevEnvDir)
echo InputDir = $(InputDir)
echo InputPath = $(InputPath)
echo InputName = $(InputName)
echo InputFileName = $(InputFileName)
echo InputExt = $(InputExt)
echo ProjectDir = $(ProjectDir)
echo ProjectPath = $(ProjectPath)
echo ProjectName = $(ProjectName)
echo ProjectFileName = $(ProjectFileName)
echo ProjectExt = $(ProjectExt)
echo SolutionDir = $(SolutionDir)
echo SolutionPath = $(SolutionPath)
echo SolutionName = $(SolutionName)
echo SolutionFileName = $(SolutionFileName)
echo SolutionExt = $(SolutionExt)
echo TargetDir = $(TargetDir)
echo TargetPath = $(TargetPath)
echo TargetName = $(TargetName)
echo TargetFileName = $(TargetFileName)
echo TargetExt = $(TargetExt)
echo VSInstallDir = $(VSInstallDir)
echo VCInstallDir = $(VCInstallDir)
echo FrameworkDir = $(FrameworkDir)
echo FrameworkVersion = $(FrameworkVersion)
echo FrameworkSDKDir = $(FrameworkSDKDir)
echo WebDeployPath = $(WebDeployPath)
echo WebDeployRoot = $(WebDeployRoot)
echo SafeParentName = $(SafeParentName)
echo SafeInputName = $(SafeInputName)
echo SafeRootNamespace = $(SafeRootNamespace)
echo FxCopDir = $(FxCopDir)

generates the following


------ Build started: Project: MacroExposition, Configuration: Debug
Any CPU ------
MacroExposition ->
C:\sandbox\MacroExposition\MacroExposition\bin\Debug\MacroExposition.exe
echo RemoteMachine =
echo References =
echo ConfigurationName = Debug
echo PlatformName = AnyCPU
echo Inherit =
echo NoInherit =
echo StopEvaluating =
echo ParentName =
echo RootNameSpace = MacroExposition
echo IntDir =
echo OutDir = bin\Debug\
echo DevEnvDir = C:\Program Files\Microsoft Visual Studio 8\Common7\IDE\
echo InputDir =
echo InputPath =
echo InputName =
echo InputFileName =
echo InputExt =
echo ProjectDir = C:\sandbox\MacroExposition\MacroExposition\
echo ProjectPath =
C:\sandbox\MacroExposition\MacroExposition\MacroExposition.csproj
echo ProjectName = MacroExposition
echo ProjectFileName = MacroExposition.csproj
echo ProjectExt = .csproj
echo SolutionDir = C:\sandbox\MacroExposition\
echo SolutionPath = C:\sandbox\MacroExposition\MacroExposition.sln
echo SolutionName = MacroExposition
echo SolutionFileName = MacroExposition.sln
echo SolutionExt = .sln
echo TargetDir = C:\sandbox\MacroExposition\MacroExposition\bin\Debug\
echo TargetPath =
C:\sandbox\MacroExposition\MacroExposition\bin\Debug\MacroExposition.exe
echo TargetName = MacroExposition
echo TargetFileName = MacroExposition.exe
echo TargetExt = .exe
echo VSInstallDir =
echo VCInstallDir =
echo FrameworkDir = c:\WINDOWS\Microsoft.NET\Framework\v2.0.50727
echo FrameworkVersion =
echo FrameworkSDKDir = C:\Program Files\Microsoft Visual Studio 8\SDK\v2.0\
echo WebDeployPath =
echo WebDeployRoot =
echo SafeParentName =
echo SafeInputName =
echo SafeRootNamespace =
echo FxCopDir =


RemoteMachine =
References =
ConfigurationName = Debug
PlatformName = AnyCPU
Inherit =
NoInherit =
StopEvaluating =
ParentName =
RootNameSpace = MacroExposition
IntDir =
OutDir = bin\Debug\
DevEnvDir = C:\Program Files\Microsoft Visual Studio 8\Common7\IDE\
InputDir =
InputPath =
InputName =
InputFileName =
InputExt =
ProjectDir = C:\sandbox\MacroExposition\MacroExposition\
ProjectPath = C:\sandbox\MacroExposition\MacroExposition\MacroExposition.csproj
ProjectName = MacroExposition
ProjectFileName = MacroExposition.csproj
ProjectExt = .csproj
SolutionDir = C:\sandbox\MacroExposition\
SolutionPath = C:\sandbox\MacroExposition\MacroExposition.sln
SolutionName = MacroExposition
SolutionFileName = MacroExposition.sln
SolutionExt = .sln
TargetDir = C:\sandbox\MacroExposition\MacroExposition\bin\Debug\
TargetPath = C:\sandbox\MacroExposition\MacroExposition\bin\Debug\MacroExposition.exe
TargetName = MacroExposition
TargetFileName = MacroExposition.exe
TargetExt = .exe
VSInstallDir =
VCInstallDir =
FrameworkDir = c:\WINDOWS\Microsoft.NET\Framework\v2.0.50727
FrameworkVersion =
FrameworkSDKDir = C:\Program Files\Microsoft Visual Studio 8\SDK\v2.0\
WebDeployPath =
WebDeployRoot =
SafeParentName =
SafeInputName =
SafeRootNamespace =
FxCopDir =
========== Build: 1 succeeded or up-to-date, 0 failed, 0 skipped ==========

And for what it's worth, this is what the virtual kicked out

echo RemoteMachine = 
echo References =
echo ConfigurationName = Debug
echo PlatformName = AnyCPU
echo Inherit =
echo NoInherit =
echo StopEvaluating =
echo ParentName =
echo RootNameSpace = WR.Common.SSIS
echo IntDir =
echo OutDir = .\bin\Debug\
echo DevEnvDir = D:\Program Files\Microsoft Visual Studio 9.0\Common7\IDE\
echo InputDir =
echo InputPath =
echo InputName =
echo InputFileName =
echo InputExt =
echo ProjectDir = C:\Src\SalesReporting\Src\CommonSSIS\
echo ProjectPath = C:\Src\SalesReporting\Src\CommonSSIS\CommonSSIS.csproj
echo ProjectName = CommonSSIS
echo ProjectFileName = CommonSSIS.csproj
echo ProjectExt = .csproj
echo SolutionDir = C:\src\SalesReporting\Src\CommonSSIS\
echo SolutionPath = C:\src\SalesReporting\Src\CommonSSIS\CommonSSIS.sln
echo SolutionName = CommonSSIS
echo SolutionFileName = CommonSSIS.sln
echo SolutionExt = .sln
echo TargetDir = C:\Src\SalesReporting\Src\CommonSSIS\bin\Debug\
echo TargetPath = C:\Src\SalesReporting\Src\CommonSSIS\bin\Debug\WRCommonSSIS.dll
echo TargetName = WRCommonSSIS
echo TargetFileName = WRCommonSSIS.dll
echo TargetExt = .dll
echo VSInstallDir =
echo VCInstallDir =
echo FrameworkDir = C:\WINDOWS\Microsoft.NET\Framework\v2.0.50727
echo FrameworkVersion =
echo FrameworkSDKDir = C:\Program Files\Microsoft SDKs\Windows\v6.0A\
echo WebDeployPath =
echo WebDeployRoot =
echo SafeParentName =
echo SafeInputName =
echo SafeRootNamespace =
echo FxCopDir =
RemoteMachine =
References =
ConfigurationName = Debug
PlatformName = AnyCPU
Inherit =
NoInherit =
StopEvaluating =
ParentName =
RootNameSpace = WR.Common.SSIS
IntDir =
OutDir = .\bin\Debug\
DevEnvDir = D:\Program Files\Microsoft Visual Studio 9.0\Common7\IDE\
InputDir =
InputPath =
InputName =
InputFileName =
InputExt =
ProjectDir = C:\Src\SalesReporting\Src\CommonSSIS\
ProjectPath = C:\Src\SalesReporting\Src\CommonSSIS\CommonSSIS.csproj
ProjectName = CommonSSIS
ProjectFileName = CommonSSIS.csproj
ProjectExt = .csproj
SolutionDir = C:\src\SalesReporting\Src\CommonSSIS\
SolutionPath = C:\src\SalesReporting\Src\CommonSSIS\CommonSSIS.sln
SolutionName = CommonSSIS
SolutionFileName = CommonSSIS.sln
SolutionExt = .sln
TargetDir = C:\Src\SalesReporting\Src\CommonSSIS\bin\Debug\
TargetPath = C:\Src\SalesReporting\Src\CommonSSIS\bin\Debug\WRCommonSSIS.dll
TargetName = WRCommonSSIS
TargetFileName = WRCommonSSIS.dll
TargetExt = .dll
VSInstallDir =
VCInstallDir =
FrameworkDir = C:\WINDOWS\Microsoft.NET\Framework\v2.0.50727
FrameworkVersion =
FrameworkSDKDir = C:\Program Files\Microsoft SDKs\Windows\v6.0A\
WebDeployPath =
WebDeployRoot =
SafeParentName =
SafeInputName =
SafeRootNamespace =
FxCopDir =

Saturday, September 12, 2009

Project Euler, Problem 1

Problem 1 for Project Euler was to "Add all the natural numbers below one thousand that are multiples of 3 or 5." Logically, it's a simple enough problem: for each number between 3 and 999, if it is evenly divisible by 3 or 5, we will need to put in our running total. The challenge for me is getting into the whole mindset of functional programming. I can't wait to compare this first solution to what I'll hopefully be writing in a few months.



#light

open System
// open Microsoft.FSharp.Collections.Set

// If we list all the natural numbers below 10 that are
// multiples of 3 or 5, we get 3, 5, 6 and 9.
// The sum of these multiples is 23.
//
// Find the sum of all the multiples of 3 or 5 below 1000.


let rec threeMultiples n =
if n = 3 then [3]
else if n % 3 = 0 then threeMultiples(n-1) @ [n]
else threeMultiples(n-1)

let rec fiveMultiples n =
if n = 5 then [5]
else if n % 5 = 0 then fiveMultiples(n-1) @ [n]
else fiveMultiples(n-1)

// Define the upper limit of where we are looking
let ceiling = 1000 - 1

// generate a list of all the natural numbers less than ceiling
// evenly divisible by 3 or 5.
// TODO: Understand how to pass 2 params to recursive fxn
let tempList = threeMultiples ceiling @ fiveMultiples ceiling

// Convert our list to a set to eliminate the duplicate values
let setList = Set.of_list tempList

// Convert the set back to a list so we can accumulate them
let sumList = Set.to_list setList

// Sum them numbers up
// You//s a big fine function,
// won//t you sum them numbers up
// Original lyrics by Juvenile (1999)
// Insipid comments like the above are why I should be sleeping
let sums = List.sum sumList

System.Console.WriteLine(String.Format("Sum of the natural numbers divisible by 3 and 5 under {0} is {1}", ceiling, sums))

Now I can sleep

Friday, September 11, 2009

F# declaration loop-de-loop

Just a quick note as it's late and I should have been in bed a few hours ago, but

let ceiling = 1000 -1

fails with "This value is not a function and cannot be applied" Why? The whitespace. As it parses now, F# is attempting to call the 1000 function with a parameter of -1*. If either the space between the 0 and - is removed or if an extra space is inserted between the - and the 1, then it will work. F# doesn't know what to do with negative one, but does know how to add negative one to 1000. Any of the following three would work

let ceiling1 = 1000-1
let ceiling2 = 1000 - 1
let ceiling3 = 1000 + -1
*At least, that's my guess. Seems reasonable enough at this point

F you

I'm talking about F#, what'd you think I was talking about? I can't tell if it's, Functional programming itself or F# that's causing me more pain. I wasn't blessed to deal with Lisp or a derivative in college so it's only as an adult that I'm trying to wrap my brain around it. Imperative and declarative programming, I think I have down solid. Python and this Johnny-come-lately C# led me to believe I should be okay with the concepts of functional programming. Visual Studio 2010 has full-fledged support for F# and a nifty tutorial which compiles and runs just fine. However, my ability to crib from that to build out my own solutions for Project Euler has been less than spectacular. After starting complex, I've distilled my problem down to "how do I append another element to a list." The damnable thing, is that they have an example right there in the tutorial of how to do it in F#.


#light
// List of best friends.
let bffs = [ "Susan"; "Kerry"; "Leslie"; "Maria" ]

// Bind newBffs to a new list that has "Ginger" as its first element.
let newBffs = "Ginger" :: bffs


Monkey see, monkey do but I wanted to work with integers and append an integer to the end.


// List first fibbonaci numbers.
let fib = [ 1; 1; 2; 3; 5; 8 ]

// Bind fibNext to a new list that has 13 as its last element.
let fibNext = fib :: 13


That throws a syntax error though: "This expression has type int but is here used with type int list list" Some hits with Google for that but no one seems to have as elementary of a sample as I'm working with. Logically, it makes me think the type is wrong with 13.


// List first fibbonaci numbers.
let fib = [ 1; 1; 2; 3; 5; 8 ]
let thirteen = [ 13 ]

// Bind fibNext to a new list that has 13 as its last element.
let fibNext = fib :: thirteen

That results in the even more maddening error of "Type mismatch. Expecting a int list list but given a int list. The type 'int list' does not match the type 'int'" Sooo, it's expecting an int list list but I only gave it an int list... However, it's at least list-like so thanks to this link I at least see that I can concat two lists with the @ operator so the following compiles so it must work

let fibNext = fib @ thirteen

Blessedly, it does and I'm back in to defiling F# with my unwashed hands. TODO: find and read an article like Dive into Python for F#

Thursday, September 10, 2009

First Powershell

Yes, PowerShell does appear to be as simple as I've heard about. What I do like and is contrary to my previous Hack and Slash post is the ability to just fire up the interpreter and "play." Tab completion allows for easy exploration of functionality. I'll have to read an actual book on it at some point in the future to plumb the depths of it as it seems like something everyone will need in their toolkit.

When new developers start or if we get a new machine here, there are some manual setup steps required. Trivial stuff but that can make for a good case for me to sample PowerShell.



# PowerShell 1.0 script
# 2009-09-10 bfellows
#
# This script is designed to facilitate the setup of new developer machines
# It creates two new machine level variables for CVS and SSIS and
# ensures the path to the CVS executable is in the system path
#

# Create variables and assign default values
$cvsPath = "C:\Program Files\CVSNT\"
$cvsRoot = "hostname=cvs.domain.com:/usr/cvs"
$ssisConfig = "Provider=SQLNCLI.1;Data Source=localhost;Integrated Security=SSPI;Initial Catalog=SYSDEVDB"
$pathDelimiter = ";"

# Snag the existing path into a dictionary
$path = Get-ChildItem Env:path

# Set the environment variables which you couldn't have guessed from the
# method call
[Environment]::SetEnvironmentVariable("SSIS_CONFIG", $ssisConfig, "Machine")
[Environment]::SetEnvironmentVariable("cvsroot", $cvsRoot, "Machine")

# If CVSNT isn't in the path (this assumes it has been installed)
if (! $path.Value.Contains($cvs))
{
# Append the cvs path, prefaced with a path delimiter
[Environment]::SetEnvironmentVariable($path.Name, $path.Value + $pathDelimiter + $cvsPath, "Machine")
}

Wednesday, September 9, 2009

Hack and slash

I had assumed everyone created a proof of concept/scratch pad/hack about project but after a conversation in IRC today, I realized that wasn't the case. In Python and other interpreted languages, if I was working with a concept or library I was unfamiliar with, I'd invoke the interpreter and "play" with it until I figured out. In the .NET world, I create a LanguageOrTechnologyHackAndSlash project in my own private sandbox whereever I go. The project typically looks like a single class (Program/Module1) with lots of static methods. While you can create a single .NET solution and have seperate projects per language, I prefer to keep them in different solutions lest I completely bone something and can't get it to compile ever again. YMMV.

Structure
It's as simple as C:\sandbox\CSHackAndSlash, C:\sandbox\VBHackAndSlash, C:\sandbox\SSISHackAndSlash, C:\sandbox\FHackAndSlash, etc Within the startup project, my Main (or equivalent) method is littered with calls to breakpoints, method calls active and commented out. out

Why
The best reason I have for doing this is history. I lose post-it notes. Things written in the interpreter go away when you close it out. It also saves me from re-inventing the wheel on things I can't seem to keep straight in my head---

What goes in it
Everything. My C# one is littered with samples of how do I keep initialize an array in .NET, what's the preferred way of dealing with case-insensitive string equality, what's the regex method I like .Match, .Matches or .IsMatch? My VB one covers the snippets of code I use in SSIS script tasks or sanity checks that my code is doing what I expect, but the data is suspect.

What's missing
I've never compared my home version, it might be interesting to have them synched and/or checked into a version control system but that may be overkill.

Sample


namespace CSHackAndSlash
{
using System;
using System.Collections.Generic;
using System.Text;
using System.Text.RegularExpressions;

using WR.NetCommon.Util;

class Program
{
/// <summary>
/// When this boolean property is set to True,
/// the VerifyAddress method will always return
/// the preferred city name for the submitted
/// address, regardless of whether or not an approved
/// vanity name was sent to the method.
/// </summary>
private bool usePreferredCity;

/// <summary>
/// Gets or sets a value indictating whether we should use
/// a vanity name for a city. e.g. Westwood KS vs Mission KS
/// </summary>
public bool UsePreferredCity
{
get { return this.usePreferredCity; }
set { this.usePreferredCity = value; }
}

/// <summary>
/// Major challenge I see with enums is we lose leading zeros
/// </summary>
public enum Rank { RegionalVicePresident = 128, ManagingPrincipal = 234 };

static void Main(string[] args)
{
bool foo = false;
// foo = Program.stringMatches("I don't think of myself as a lion. You might as well though, I have a mighty roar", "i DON'T THINK OF MYSELF AS A LION. yOU MIGHT AS WELL THOUGH, i HAVE A MIGHTY ROAR");
Regex re = new Regex(@"\d");
Match m = null;
Match m2 = null;
m = re.Match("abc main");
m2 = re.Match("200 main");
re.Mat
int foo = -1;
foo = m.Length;
Program.TestProdWebService();
}

static void TestLoadWebService()
{
loadserver.AddressCleanerService service = null;
service = new CSHackAndSlash.loadserver.AddressCleanerService();
service.Timeout = 60 * 60 * 1000; // in milliseconds
service.Credentials = System.Net.CredentialCache.DefaultCredentials;
loadserver.Entity result = null;
result = service.CleanAddress("1318 truman boolevard", "rolla mo 65400");

// try the big CASS
string fname = @"\\loadserver\AddressCleanerServiceData\BulkCleanerValidation.txt";
bool results;
try
{
results = service.CleanBulkAddresses(fname);
}
catch (Exception ex)
{
;
}
}

static void TestProdWebService()
{
prodserver.AddressCleanerService service = null;
service = new CSHackAndSlash.prodserver.AddressCleanerService();
service.Timeout = 60 * 60 * 1000; // in milliseconds
service.Credentials = System.Net.CredentialCache.DefaultCredentials;
prodserver.Entity result = null;
result = service.CleanAddress("1318 truman boolevard", "rolla mo 65400");

// try the big CASS
string fname = @"\\prodserver\AddressCleanerServiceData\BulkCleanerValidation.txt";
bool results;
try
{
results = service.CleanBulkAddresses(fname);
}
catch (Exception ex)
{
;
}
}

static void iMain(string[] args)
{
// Program.stringManip();
// Program.stringcomparengrams();
// Program.GenerateAddressDuplicateCode(null, null);

string value = string.Empty;
String objValue = null;
object valueObj = null;
Object objValueObj = null;
/*
unsafe
{
int* addrOfValue;
int* addrOfObjValue;
int* addrOfObjValue2;
int* addrOfValueObj;
int* addrOfObjValueObj;

value = "Hello world";
addrOfValue = &value;
objValue = new String(value.ToCharArray());
addrOfObjValue = &objValue;
objValue = value;
addrOfObjValue2 = &objValue;
valueObj = new object();
addrOfValueObj = &valueObj;
objValueObj = new Object();
addrOfObjValueObj = &objValueObj;

System.Console.WriteLine("The address stored in addrOfValue: {0}", addrOfValue->ToString());
System.Console.WriteLine("The address stored in addrOfObjValue: {0}", addrOfObjValue->ToString());
System.Console.WriteLine("The address stored in addrOfObjValue2: {0}", addrOfObjValue2->ToString());
System.Console.WriteLine("The address stored in addrOfValueObj: {0}", addrOfValueObj->ToString());
System.Console.WriteLine("The address stored in addrOfObjValueObj: {0}", addrOfObjValueObj->ToString());
}
*/

}
/// <summary>
/// Generate an address duplicate code which serves as a hash function
/// for standardized addresses.
/// </summary>
/// <param name="postalCode">5 digit postal code</param>
/// <param name="addressLine1">The first line of an address</param>
/// <returns>A hash of input parameters</returns>
public static string GenerateAddressDuplicateCode(string postalCode, string addressLine1)
{
string soundex = string.Empty;
StringBuilder digits = new StringBuilder(postalCode);
StringBuilder alphas = new StringBuilder();
List<char> digitList = new List<char>(new char[] { '0', '1', '2', '3', '4', '5', '6', '7', '8', '9' });
List<char> alphaList = new List<char>(new char[] { 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z', 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z' });

// hash = zip5 + line1.digits + Soundex(line1.alphas)
if (addressLine1 != null)
{
foreach (char c in addressLine1.ToCharArray())
{
if (digitList.Contains(c))
{
digits.Append(c);
}
else if (alphaList.Contains(c))
{
alphas.Append(c);
}
}
}

soundex = Matching.Soundex(alphas.ToString());

// Couldn't compute the soundex
if (soundex.StartsWith("00"))
{
soundex = string.Empty;
}

return string.Format("{0}{1}", digits.ToString(), soundex);
}

public static void stringcomparengrams()
{
string line1 = "quidditch A";
string line2 = "uqidditch a";

string actual1 = string.Empty;
string actual2 = string.Empty;
List<NGram> list1 = null;
List<NGram> list2 = null;
int matchCount = 0;

// at times like this, I really miss python's simple split(' ').join('') syntax
actual1 = String.Concat(line1.Split(new char[] { ' ', '\t', '\n' }));
actual2 = String.Concat(line2.Split(new char[] { ' ', '\t', '\n' }));

// Console.WriteLine(string.Format("Before: {0}\tAfter: {1}", string1, actual1));
list1 = new List<NGram>(Matching.GetNgrams(actual1, 3));
list2 = new List<NGram>(Matching.GetNgrams(actual2, 3));
List<NGram> matchList = null;
NGram element = list1[0];
list2.Find(delegate(NGram a) { return a.Equals(element); });

Matching.StringCompareNgrams(line1, line2);
Console.Read();
}

public static void stringManip()
{
string m_Recipient = string.Empty;
string _pagerAddress = string.Empty;
m_Recipient = "dbasDistlist@domain.tld;DotNet@domain.tld;PageWeb1@domain.tld";
_pagerAddress = "pageweb1@domain.tld";

if (m_Recipient.ToLower().Contains(_pagerAddress))
{
List<string> _list = null;
_list = new List<string>(m_Recipient.ToLower().Split(';'));
_list.Remove(_pagerAddress);
m_Recipient = string.Join(";", _list.ToArray());
}
Console.WriteLine(m_Recipient);
Console.Read();
}

public static bool stringMatches(string string1, string string2)
{
if ((string1 == null) && (string2 == null))
{
return true;
}

if ((string1 == null) (string2 == null))
{
return false;
}

// The most simple of a match, same string
if (string1.Equals(string2, StringComparison.OrdinalIgnoreCase))
{
return true;
}

// use advanced algs, double metaphone, ngrams, etc
;
}
}
}