The Challenge of Big Data – It’s more than just big files!
Big Data is a term that crops up a lot these days, and its meaning can be deceptive. Data is often labelled “Big Data” simply because a file passes a certain size. In reality the picture is less about size and more about complexity. Big data files do not need to be measured in terabytes, or even gigabytes; it is the complexity of the data, and its relationships with the other data sources inside an organisation, that can make it more valuable than the raw data alone. Reducing the time taken to process the information also increases the value of the data, because decisions can be made sooner or given more thought.
Gartner predicts that 2016 will see big data move on from the ingestion of data to the automation of analytics, with artificial intelligence (AI) being used to leverage the power of data. Before any of this can happen, however, you need to have the data in the right location and format.
Getting Big Data in…
The faster you can get data into your organisation, the sooner it can be analysed. To speed up the receipt of data, several managed file transfer solutions now include proprietary, high-speed file transfer protocols. These are based on UDP streams or parallel TCP connections and increase bandwidth utilisation to over 90%.
The new protocols enhance data transfer rates significantly, enabling gigabytes of data to be delivered across the world in less than a minute, which makes for some impressive headlines. This technology is still in its early days, meaning that no open protocols offer this increased utilisation, so opting for one approach over another is a matter of preference.
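The parallel approach can be illustrated with a minimal Python sketch. It is not any vendor's protocol: it copies a local file in several byte ranges at once, with each worker standing in for one TCP connection. The function names (split_ranges, copy_range, parallel_copy) and the default of four workers are assumptions made for illustration only.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def split_ranges(total_size, parts):
    """Return (offset, length) pairs that together cover total_size bytes."""
    base, extra = divmod(total_size, parts)
    ranges, offset = [], 0
    for i in range(parts):
        length = base + (1 if i < extra else 0)
        if length:
            ranges.append((offset, length))
        offset += length
    return ranges


def copy_range(src_path, dst_path, offset, length, chunk_size=1 << 20):
    """Copy one byte range; stands in for a single connection (assumption)."""
    with open(src_path, "rb") as src, open(dst_path, "r+b") as dst:
        src.seek(offset)
        dst.seek(offset)
        remaining = length
        while remaining > 0:
            data = src.read(min(chunk_size, remaining))
            if not data:
                break
            dst.write(data)
            remaining -= len(data)


def parallel_copy(src_path, dst_path, parts=4):
    """Copy src to dst using several workers, one per byte range."""
    size = os.path.getsize(src_path)
    # Pre-size the destination so each worker can write at its own offset.
    with open(dst_path, "wb") as dst:
        dst.truncate(size)
    with ThreadPoolExecutor(max_workers=parts) as pool:
        futures = [
            pool.submit(copy_range, src_path, dst_path, offset, length)
            for offset, length in split_ranges(size, parts)
        ]
        for future in futures:
            future.result()  # re-raise any error from a worker
```

Over a real network, the gain comes from keeping several connections busy at once so that latency on one stream does not leave the link idle. On a local disk the sketch only demonstrates the splitting and reassembly logic.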
Getting Big Data sorted…
Getting the data into your organisation is all well and good, but it is only half of the challenge. All the data needs to be analysed before it can become useful. However, it arrives in your managed file transfer system from a variety of sources, in different formats and, almost invariably, not in the format your central data analysis tool needs.
Two simple enhancements can increase the efficiency and speed with which you ingest your data: firstly, pushing data when it’s ready instead of waiting for it to be collected, and secondly, triggering events when the file is received by your managed file transfer system.
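The second enhancement, triggering an event on arrival, can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how any particular managed file transfer product works: the inbox directory, the scan_once function and the on_arrival callback are all hypothetical names, and a real system would normally raise the event itself once a transfer completes rather than scan a folder.

```python
import os


def scan_once(inbox, seen, on_arrival):
    """Call on_arrival(path) for each file in inbox that has not been seen yet.

    `seen` is a set of file names already handled; it is updated in place.
    Returns the list of newly handled paths.
    """
    new_files = []
    for name in sorted(os.listdir(inbox)):
        path = os.path.join(inbox, name)
        if os.path.isfile(path) and name not in seen:
            seen.add(name)
            on_arrival(path)  # e.g. start translation or loading
            new_files.append(path)
    return new_files
```

Calling scan_once repeatedly with the same seen set means each file triggers its handler once, as soon as it is noticed, instead of waiting for a scheduled batch collection. The sketch does not check whether a file is still being written, which a production workflow would need to handle.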
Remove another step from the process…
Implementing a managed file transfer solution that can stream files to a target server provides productivity gains over traditional store-and-forward workflows. By writing a large data set directly onto the intended target system, you’re able to remove another step in the process.
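The difference from store and forward can be shown with a small Python sketch. It assumes only that the incoming data and the target are file-like objects with read and write methods; the name stream_to_target and the 64 KB chunk size are illustrative choices, not part of any product.

```python
def stream_to_target(source, target, chunk_size=64 * 1024):
    """Relay bytes from source to target chunk by chunk.

    Nothing is written to an intermediate file: each chunk goes to the
    target as soon as it has been read. Returns the number of bytes relayed.
    """
    total = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            break
        target.write(chunk)
        total += len(chunk)
    return total
```

A store-and-forward workflow would first write the whole file to local storage and only then start a second transfer to the target. Relaying each chunk as it arrives removes that intermediate write and the wait for the full file.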
Once the latency has been pared back, the next stumbling block is getting the data into a usable format. There are hundreds of data standards, and even the most common of these are often “augmented” with extra data by specific applications.
Integrating some form of data translation, often through post-processing scripts or applications, is a common approach. This works well until the next upgrade changes the “standard” slightly and the translation script needs to be edited or even rewritten. Modern managed file transfer solutions can transform data so that it is presented to the target system in a format that it recognises and can process. These transformations can be a simple XLS to XML conversion or a much more complex EDI or database translation.
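To give a feel for the simple end of that range, here is a minimal Python sketch of a tabular-to-XML translation. It uses CSV as a stand-in for XLS, because reading XLS files needs a third-party library. The csv_to_xml function and the records and record tag names are assumptions made for illustration.

```python
import csv
import io
import xml.etree.ElementTree as ET


def csv_to_xml(csv_text, root_tag="records", row_tag="record"):
    """Convert CSV text with a header row into an XML string."""
    root = ET.Element(root_tag)
    for row in csv.DictReader(io.StringIO(csv_text)):
        record = ET.SubElement(root, row_tag)
        for column, value in row.items():
            # Assumes every column name is a valid XML element name and
            # every row has the same number of fields as the header.
            ET.SubElement(record, column).text = value
    return ET.tostring(root, encoding="unicode")
```

For example, csv_to_xml("id,name\n1,Ada\n") produces one record element with id and name children. The comment in the code marks where this kind of script is fragile: if an upgrade renames a column or adds a field, the translation has to change with it.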
A growing requirement of managed file transfer…
The world of managed file transfer has evolved to enable companies that need to move big data to do so as efficiently as possible. Streamlining the delivery of data (of varying types, sizes and structures) from external trading partners onto internal big data analytics solutions is becoming a much more common requirement from our customers.
If you have a Big Data file transfer project and would like the assistance of our pre-sales and technical experts, contact us here or call +44 (0) 20 7118 9640.