Windows network shares with python
Backstory
On a daily basis, we receive data extracts from a mainframe. They provide a header and data file for whatever the business users want to explore. This client has lots of old data ferreted away and they need to figure out if there's value in it. Our job is to consume the header files to drop and create tables in SQL Server and then populate with actual data. The SQL is trivial -
CREATE TABLE Foo (Col1 varchar(255), ColN varchar(255)); BULK INSERT Foo FROM 'C:\sourceFile.csv' WITH (FIRSTROW=1,ROWTERMINATOR='\n',FIELDTERMINATOR='|');
Let's make this harder than it should be
Due to ... curious permissions and corporate politics, the SQL Server service account could only read files via a network share (\\Server\Share\Input\File.csv), never you no mind the fact that path was really just D:\Share\Input. A local drive but permissions were such that we couldn't allow the service account to read from the drive. Opening a network share up and letting the account read from that - no problem.
What are the shares?
That's an easy question to answer, because I knew the answer. net share
. I coded up a simple parser and all was well and good until I ran it on the server which had some really, long share names and/or the Resource was long. Like this
Share name Resource Remark ------------------------------------------------------------------------------- C$ C:\ Default share IPC$ Remote IPC ADMIN$ C:\WINDOWS Remote Admin DEV2016 \\?\GLOBALROOT\Device\RsFx0410\Super. The output of net share is quasi fixed width and it just wraps whatever it needs to onto the next line/column.\DEV2016 SQL Server FILESTREAM share RidiculouslyLongShareName C:\users\bfellows\Downloads The command completed successfully.
What are the sharesv2
Windows Management Instrumentation to the rescue! WMIC.exe /output:stdout /namespace:\\root\cimv2 path Win32_Share GET Name, Path
That's way better, sort of
Name Path ADMIN$ C:\WINDOWS C$ C:\ DEV2016 \\?\GLOBALROOT\Device\RsFx0410\Originally, that command ended with GET * which resulted in a lot more information being returned than I needed. The devil though, is that the output width is dependent upon the source data. If I remove the network share for my RidiculouslyLongShareName and rerun the command, I get this output\DEV2016 IPC$ RidiculouslyLongShareName C:\users\bfellows\Downloads
Name Path ADMIN$ C:\WINDOWS C$ C:\ DEV2016 \\?\GLOBALROOT\Device\RsFx0410\It appears to be longest element +2 spaces for this data but who knows what the real encoding rule is. The good thing is, that while variable, the header rows gives me enough information to slice up the data as needed.\DEV2016 IPC$ Users C:\Users
This needs to run anywhere
The next problem is that this process in Dev runs on D:\Share but in QA is is on the I:\datafiles\instance1 and oh by the way, there are two shares for the I drive \\qa\Instance1 (I:\datafiles\instance1) and \\qa\datafiles. (I:\datafiles) In the case where there are multiple shares, if there's one for the folder where the script is running, that's the one we want. Otherwise, it's probably the "nearest" path which I interpreted as having the longest path.
Code good
Here's my beautiful, hacky python. Wherever this script runs, it will then attempt to render the best share path to the same location.
import os import subprocess def _generate_share_dictionary(headerRow): """Accepts a variable width, white space delimited string that we attempt to divine column delimiters from. Returns a dictionary of field names and a tuple with start/stop slice positions""" # This used to be a more complex problem before I realized I didn't have # to do GET * in my source. GET Name, Path greatly simplifies # but this code is generic so I keep it as is header = headerRow fields = header.split() tempOrds = {} ords = {} # Populate the temporary ordinals dictionary with field name and the # starting, zero based, ordinal for it. # i.e. given #Name Path #01234567890123456789 # we would expect Name:0, Path:9 for field in fields: tempOrds[field] = headerRow.index(field) # Knowing our starting ordinal positions, we will build a dictionary of tuples # that contain starting and ending positions of our fields for iter in range(0, len(fields) -1): ords[fields[iter]] = (tempOrds[fields[iter]], tempOrds[fields[iter+1]]) # handle the last element ords[fields[-1]] = (tempOrds[fields[-1]], len(headerRow)) return ords def get_network_shares(): """Use WMIC to get the full share list. Needed because "net share" isn't parseable""" _command = r"C:\Windows\System32\wbem\WMIC.exe /output:stdout /namespace:\\root\cimv2 path Win32_Share GET Name, Path" #_command = r"C:\Windows\System32\wbem\WMIC.exe /output:stdout /namespace:\\root\cimv2 path Win32_Share GET *" _results = subprocess.check_output(_command, shell=True).decode('UTF-8') _headerRow = _results.splitlines()[0] headerOrdinals = _generate_share_dictionary(_headerRow) _shares = parse_network_shares_name_path(headerOrdinals, _results) return _shares def parse_network_shares_name_path(header, results): """Rip apart the results using our header dictionary""" _shares = {} #use the above to slice into our results #skipping first line since it is header for _line in results.splitlines(): if _line: _shares[_line[header["Name"][0]: header["Name"][1]].rstrip()] = _line[header["Path"][0]: header["Path"][1]].rstrip() return _shares def translate_local_path_to_share(currentPath): """Convert the supplied path to the best match in the shares list""" shareName = "" defaultShare = "" shares = get_network_shares() # find the first share match if currentPath in shares.values(): shareName = [key for key, value in shares.items() if value == currentPath][0] else: #see if we can find a partial match # favor longest path best = "" pathLength = 0 for share, path in shares.items(): # path can be empty due to IPC$ share if path: # Is the share even applicable? if path in currentPath: # Favor the non default/admin share (DriveLetter$) if share.endswith('$'): defaultShare = currentPath.replace(path[:-1], share) else: if len(path) > pathLength: shareName = currentPath.replace(path[:-1], share) # No other share was found if (defaultShare and not shareName): shareName = defaultShare x = os.path.join(r"\\" + os.environ['COMPUTERNAME'], shareName) print("Current folder {} maps to {}".format(currentPath, x)) return os.path.join(r"\\" + os.environ['COMPUTERNAME'], shareName) def main(): current = os.getcwd() #current = "C:\WINDOWS" share = translate_local_path_to_share(current) print("{} aka {}".format(current, share)) if __name__ == "__main__": main()
Takeaways
You probably won't ever need all of the above code to be able to swap out a local path for a network share using python but by golly if you do, have fun. Also, python is still my most favorite language, 14 years running.
No comments:
Post a Comment