A blog about SQL Server, SSIS, C# and whatever else I happen to be dealing with in my professional life.


Wednesday, May 12, 2010

MelissaData AddrObj how I hate thee

I've been bitten twice in my life by using MelissaData for address cleaning/standardization. The first time was back when I was impulsive and foolish: CASSing certain formats of County Road would cause the COM component to just belly up and die. No event raised, nothing catchable, just dead. We had DTS wired up to use AddrObj.dll and we'd be in the middle of processing tens to hundreds of thousands of rows of data when, without notice, the process would be dead. After a few rounds of these failures, with different sets of data and different addresses, I went to their website to look for help. Instead of help, I found their free address cleaner. Interesting observation: the same address that would cause our process to fail would cause their entire website to crash. After a few months of this off-and-on behaviour, we moved on to different products for address standardization.

Fast-forward to a year ago. We needed to detect fraudulent activity based on several criteria, and one of those criteria involved checks being delivered to registered addresses. You can't match non-standard data, so we needed to clean the address data, and our in-house choices were MelissaData and Trillium. I wanted Trillium to be our solution, I really did. Based on the way it's configured here, however, I just couldn't make it work, so I went with what I knew. I developed, tested and implemented it, and everything had gone peachy for almost a year until we patched on 29 April. From that point forward, we couldn't clean a day's worth of data (approximately 1k rows) without it going belly up. If you have version 4, 0, 1, 1567 of AddrObj.dll, the following code is the minimum reproduction I sent MelissaData that will cause their code to die a silent death. The address set in the LastLine property is amusing, as it's only a single character different from the sample provided.


using ADDRESSOBJECTLib;

public class Cleaner
{

    /// <summary>
    /// A minimum reproduction of the failing code
    /// </summary>
    /// <param name="licenseKey">A valid license key for the AddrObj</param>
    /// <param name="dataFileFolder">Fully qualified path to the Data Files folder</param>
    /// <param name="lastLine">The city/state/ZIP line to assign to the LastLine property</param>
    public static void Fail(string licenseKey, string dataFileFolder, string lastLine)
    {
        AddressCheck checker = new AddressCheckClass();
        checker.SetLicenseString(licenseKey);
        checker.SetPathToUSFiles(dataFileFolder);
        AddressObjectErrorCodes addressObjectErrors = checker.InitializeDataFiles();
        checker.ClearProperties();
        // This is where the failure will occur
        // Try/catch will make no difference as no exception is raised
        checker.LastLine = lastLine;
    }
    
    static int Main(string[] args)
    {
        string licenseKey = "XX-XXX-XXX";
        string libraryPath = @"D:\COM Objects\Data Files";
        // Differs by a single character from the sample provided
        string lastLine = "Rancho Sa Margarita, CA 92688-221A";
        Cleaner.Fail(licenseKey, libraryPath, lastLine);
        // Never reached with the affected build; the process dies inside Fail
        return 0;
    }

}
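
Since try/catch can't help when the component takes the whole process down, one way to keep a batch job alive is to run the cleaning step in a separate process and watch it from the outside. The following is a minimal sketch of that idea, not what we actually ran: `IsolatedRunner`, `TryRun` and whatever helper executable you point it at are hypothetical, and it assumes the helper returns 0 only when it runs to completion. Whether the silent death shows up as a non-zero exit code is something you'd have to confirm against your own build.

```csharp
using System;
using System.Diagnostics;

public static class IsolatedRunner
{
    /// <summary>
    /// Runs a helper executable in its own process and reports whether it exited cleanly.
    /// Hypothetical helper: nothing here is part of the MelissaData API.
    /// </summary>
    /// <param name="exePath">Path to the helper executable that does the risky work</param>
    /// <param name="arguments">Command line arguments for the helper</param>
    /// <param name="timeoutMs">How long to wait before giving up on the helper</param>
    /// <param name="exitCode">The helper's exit code, or -1 if it was killed on timeout</param>
    public static bool TryRun(string exePath, string arguments, int timeoutMs, out int exitCode)
    {
        ProcessStartInfo startInfo = new ProcessStartInfo(exePath, arguments);
        startInfo.UseShellExecute = false;
        startInfo.CreateNoWindow = true;

        using (Process child = Process.Start(startInfo))
        {
            // A hung helper is treated the same as a dead one
            if (!child.WaitForExit(timeoutMs))
            {
                child.Kill();
                child.WaitForExit();
                exitCode = -1;
                return false;
            }

            // Assumption: the helper returns 0 only when it ran to completion
            exitCode = child.ExitCode;
            return exitCode == 0;
        }
    }
}
```

The parent would call `TryRun` once per batch and set aside any batch that comes back false, so one bad address costs you a batch rather than the whole run.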

This time I grew up: I contacted MelissaData and worked with their tech support rather than piss and moan. To their credit, they responded with a pair of fixes within two days. The first was a beta version of the new address object, but it was too wrapped in caveats, plus I wasn't the only user of this component. Fortunately-ish, the other team experienced a failure on the same day the patch email came in, so what had gone from a low-priority production fix to ZOMG-WE-ARE-DEAD-IN-THE-WATER-EVERYONE-LOOK-HERE could be handled with some grace. The resolution we took was to recover the AddrObj.dll from our previous disc, replace the existing dll and re-register it. Our automated processes ran fine this morning, so I am a happy camper.
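
If you go the same route, it's worth confirming which build is actually sitting on disk after the swap. Here is a small sketch, assuming the DLL's version resource carries the same four numbers quoted above (4, 0, 1, 1567). `AddrObjVersion` and the path you pass in are hypothetical; `FileVersionInfo` is the standard .NET class.

```csharp
using System.Diagnostics;

public static class AddrObjVersion
{
    /// <summary>
    /// True when the four version parts match the build that failed for us (4, 0, 1, 1567).
    /// </summary>
    public static bool IsAffectedBuild(int major, int minor, int build, int revision)
    {
        return major == 4 && minor == 0 && build == 1 && revision == 1567;
    }

    /// <summary>
    /// Reads the version resource of the DLL at the given path and checks it
    /// against the affected build. The path is whatever your install uses.
    /// </summary>
    public static bool IsAffectedFile(string dllPath)
    {
        FileVersionInfo info = FileVersionInfo.GetVersionInfo(dllPath);
        return IsAffectedBuild(
            info.FileMajorPart,
            info.FileMinorPart,
            info.FileBuildPart,
            info.FilePrivatePart);
    }
}
```

Calling `AddrObjVersion.IsAffectedFile` with the full path to your AddrObj.dll before kicking off a run gives you a cheap guard against a patch quietly putting the bad build back.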
