Metadata in the Managed File Transfer Space
One of the limitations of using any file transfer protocol is describing the file that is being transferred. In early iterations of many (though not all) solutions, this was not even a consideration – if you needed to add some information, you included a header, or possibly just another file to describe the first. This was (and still is) very cumbersome, requiring a file to be opened just to determine its content.
Someone who was way ahead of the game on this issue was the ancient scholar and literary critic Zenodotus, who in around 280 BC was the first librarian of Alexandria. Zenodotus organised the library by subject matter and author but, more importantly for this blog, attached a small tag to each scroll describing the content, title, subject and author. This approach meant that scholars no longer had to unroll scrolls to see what they contained, and is the first recorded use of metadata.
In IT terms, metadata came into play in the 1970s as an alternative method of locating data when designing databases, but it really became established as an integral part of data manipulation when XML became popular for web services.
Metadata in MFT
In terms of Managed File Transfer (MFT), if we consider a file being transferred as analogous to a scroll, we might use the metadata ‘tag’ to record things about the file – the person sending it, its content, its final recipient and perhaps a checksum hash. The possibilities for use are endless and we very quickly get to a point of wondering how we ever got by without it.
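To make the 'tag' idea concrete, here is a minimal sketch in Python of what such a record might look like. The TransferTag class, its field names and the choice of SHA-256 are assumptions made purely for illustration; no MFT product is known to use this exact layout.

```python
import hashlib
import json
from dataclasses import dataclass, asdict
from pathlib import Path


@dataclass
class TransferTag:
    """Hypothetical metadata 'tag' for one transferred file."""
    filename: str
    sender: str
    recipient: str
    description: str
    sha256: str


def build_tag(path, sender, recipient, description):
    # Hash the file in chunks so large transfers do not need to fit in memory.
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return TransferTag(
        filename=Path(path).name,
        sender=sender,
        recipient=recipient,
        description=description,
        sha256=digest.hexdigest(),
    )


def tag_to_json(tag):
    # Serialise the tag so it can travel alongside the file or be stored.
    return json.dumps(asdict(tag), indent=2, sort_keys=True)
```

The recipient can recompute the hash on arrival and compare it with the value in the tag, which is the digital equivalent of checking that the scroll matches its label.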
But before you start googling how to add metadata in a traditional transfer, you should be aware that the only metadata you are likely to be able to successfully access by FTP or SFTP are the filename, size and a timestamp, usually the last-modified date rather than a true creation date (occasionally permissions or ownership too, depending upon the system). Obviously, this isn’t too useful when describing the data – what’s required is a little help from the file transfer vendors. This is normally delivered to end users via a webform – an HTML-based form containing several input fields completed at upload time – or via some form of API. The metadata is then stored either in XML files or, more commonly, in a database, from where it can be related to the files and queried as required.
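As a rough illustration of the database approach, the sketch below uses Python's built-in sqlite3 module to store form fields against a filename and query them back. The table name file_metadata, the one-row-per-field layout and the helper function names are all assumptions for this example, not the schema of any particular vendor.

```python
import sqlite3


def open_store(db_path=":memory:"):
    # Hypothetical schema: one row per (file, form field) pair.
    conn = sqlite3.connect(db_path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS file_metadata (
               filename TEXT NOT NULL,
               field    TEXT NOT NULL,
               value    TEXT NOT NULL,
               PRIMARY KEY (filename, field)
           )"""
    )
    return conn


def save_metadata(conn, filename, fields):
    # One row per form field keeps the schema stable when the webform changes.
    rows = [(filename, key, str(val)) for key, val in fields.items()]
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR REPLACE INTO file_metadata VALUES (?, ?, ?)", rows
        )


def find_files(conn, field, value):
    # Relate metadata back to files: which uploads carry this field value?
    cursor = conn.execute(
        "SELECT filename FROM file_metadata "
        "WHERE field = ? AND value = ? ORDER BY filename",
        (field, value),
    )
    return [row[0] for row in cursor]
```

Keying every row on the filename is the simplest way to keep the link between file and metadata. A real system would more likely key on a unique transfer ID, since filenames can repeat.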
What do I do with the Metadata?
Generating a webform for uploading metadata in a Managed File Transfer system is actually quite simple – the challenge comes later, when trying to maintain the relationship to the file. For example, will your automation engine be able to (a) read the metadata, and (b) act upon the metadata to determine what to do with the file? This is quite straightforward to plan, but unfortunately not so simple to implement.
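The 'read, then act' step might look something like the following sketch, in which an automation engine picks an action from the metadata using the first matching rule. The rule format, the action strings and the quarantine default are hypothetical and simply illustrate the idea.

```python
def choose_action(metadata, rules, default="quarantine"):
    """Return the action of the first rule whose conditions all match."""
    for conditions, action in rules:
        if all(metadata.get(key) == expected
               for key, expected in conditions.items()):
            return action
    # Nothing matched: fall back to a safe default rather than guessing.
    return default


# Hypothetical rules, ordered from most specific to least specific.
RULES = [
    ({"recipient": "finance", "content": "invoice"}, "deliver:/finance/invoices"),
    ({"recipient": "finance"}, "deliver:/finance/inbox"),
]
```

Even in this toy form, the difficulty mentioned above is visible: the rules only work if the metadata is still reliably attached to the file when the workflow runs, and if every form field uses the values the rules expect.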
Some vendors have a slightly more advanced workflow methodology than others – if webforms and metadata are necessary for your environment, then it may be worthwhile looking at out-of-the-box solutions rather than coding your own. The challenges around building, securing and maintaining your own webform and workflow combination frequently outweigh the cost of buying such a system. Without doubt, however, all the major MFT vendors provide some form of webform integration to one extent or another. Metadata is here to stay in the world of MFT, but at the time of writing there is no industry standard, clear winner or even preference in direction.