A blog about SQL Server, SSIS, C# and whatever else I happen to be dealing with in my professional life.

Find ramblings

Monday, October 8, 2018

Reading Excel files without Excel

Reading Excel files without Excel

A common problem working with Excel data is Excel itself. Working with it programatically requires an installation of Office, and the resulting license cost, and once everything is set, you're still working with COM objects which present its own set of challenges. If only there was a better way.

Enter, the better way - EPPlus. This is an open source library that wraps the OpenXml library which allows you to simply reference a DLL. No more installation hassles, no more licensing (LGPL) expense, just a simple reference you can package with your solutions.

Let's look at an example. Here's a simple spreadsheet with a header row and a row's worth of data.

For each row, after the header, I'll read the 7 columns into a list and then, since I assume the last column, BusinessFriendlyName, is unique, I'll use that as the key for my return dictionary.

using OfficeOpenXml;
...
    /// Get data from Excel worksheet
    public Dictionary<string, List<string>> GetExcelDriverData(string sourceFile)
    {
        Dictionary<string, List<string>> d = new Dictionary<string, List<string>>();
        System.IO.FileInfo fi = new System.IO.FileInfo(sourceFile);
        using (ExcelPackage ep = new ExcelPackage(fi))
        {
            ExcelWorkbook wb = ep.Workbook;
            ExcelWorksheet ws = wb.Worksheets.First();
            if (ws != null)
            {
                // 1 based array to 7, inclusive
                for (int i = ws.Dimension.Start.Row+1; i < ws.Dimension.End.Row+1; i++)
                {
                    List<string> row = new List<string>() { ws.Cells[i, 1].Value.ToString()
                    ,   ws.Cells[i, 2].Value.ToString()
                    ,   ws.Cells[i, 3].Value.ToString()
                    ,   ws.Cells[i, 4].Value.ToString()
                    ,   ws.Cells[i, 5].Value.ToString()
                    ,   ws.Cells[i, 6].Value.ToString()
                    ,   ws.Cells[i, 7].Value.ToString()
                    };
                    
                    d[ws.Cells[i, 7].Value.ToString()] = row;
                }
            }
        }
        
        return d;
    }

It's as easy as that. There are plenty of more clever implementations out there but I wanted to demonstrate a quick and easy method to read Excel from your .NET code.

No comments: