The Challenge of Big Data – It’s more than just big files!

Big Data is a term that crops up a lot these days, and its meaning can be deceptive. Data is often labelled "Big Data" once its file size passes a certain threshold, but in reality the picture is less about size and more about complexity. Big data files need not be measured in terabytes, or even gigabytes; it is the complexity of the data, and its inter-relationships with the other data sources inside an organisation, that can make it more valuable than the raw data alone. Reducing the time taken to process the information also means decisions can be made sooner, or considered for longer, and the value of the data increases.

Gartner predicts that in 2016 big data will move on from the ingestion of data to automated analytics, with artificial intelligence (AI) used to leverage the power of data. But before any of this can happen, you need to have the data in the right location and format first.

Big Data Characteristics – The 4 V's

Getting Big Data in…

The faster you can get data into your organisation, the sooner it can be analysed. To speed up the process of receiving data, several managed file transfer solutions now include proprietary, high-speed file transfer protocols. These are based on UDP streams or parallel TCP connections, increasing bandwidth utilisation to over 90%.

These new protocols enhance data transfer rates significantly, enabling gigabytes of data to be delivered across the world in under a minute and making for some impressive headlines. The technology is still in its early days, meaning that no open protocols yet offer this increased utilisation; opting for one approach over another is a matter of preference.

Getting Big Data sorted…

Getting the data into your organisation is all well and good, but it is only half of the challenge. All the data needs to be analysed before it can become useful; however, it arrives in your managed file transfer system from a variety of sources, in different formats and, almost invariably, not the format your central data analysis tool needs.

Two simple enhancements can increase the efficiency and speed at which you ingest your data: firstly, pushing data when it's ready instead of waiting for it to be collected; and secondly, triggering events when a file is received by your managed file transfer system.
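
As an illustrative sketch of the second enhancement, a receipt trigger can be as simple as a loop that watches a drop directory and fires a handler the moment a new file lands. The directory layout and handler here are hypothetical; real managed file transfer products expose this as built-in event rules rather than requiring a script.

```python
import os
import time

def watch_for_files(drop_dir, on_received, poll_interval=1.0, max_polls=None):
    """Fire on_received(path) for each new file that appears in drop_dir.

    Triggering on receipt removes the dead time of waiting for a
    scheduled collection run to come around.
    """
    seen = set(os.listdir(drop_dir))
    polls = 0
    while max_polls is None or polls < max_polls:
        current = set(os.listdir(drop_dir))
        for name in sorted(current - seen):
            on_received(os.path.join(drop_dir, name))  # e.g. start ingestion
        seen = current
        polls += 1
        time.sleep(poll_interval)
```

A production version would also need to confirm that an upload is complete (a stable file size, or a companion "done" marker) before triggering, so partial files are never ingested.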

Remove another step from the process…

Implementing a managed file transfer solution that can stream files to a target server provides productivity gains over traditional store-and-forward workflows. By writing a large data set directly onto the intended target system, you remove another step from the process.

Once the latency has been pared back, the next stumbling block is getting the data into a usable format. There are literally hundreds of data standards, and even the most common of these are often "augmented" with extra data by specific applications.

Integrating some form of data translation, often via post-processing scripts or applications, is a common approach. This works well until the next upgrade changes the "standard" slightly and the translation script needs to be edited or even rewritten. Modern managed file transfer solutions can transform data so it is presented to the target system in a format it recognises and can process. This can be a simple XLS-to-XML conversion or a much more complex EDI or database translation.
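
As a minimal sketch of such a translation step, here is a toy CSV-to-XML mapping. This is not any vendor's transformation engine, and the tag names are purely illustrative, but it shows the shape of the work: read one format, emit the one the target system expects.

```python
import csv
import io
import xml.etree.ElementTree as ET

def csv_to_xml(csv_text, root_tag="records", row_tag="record"):
    """Translate delimited rows into a simple XML document.

    A stand-in for the kind of translation an MFT workflow performs
    before handing data to a downstream analytics tool.
    """
    root = ET.Element(root_tag)
    for row in csv.DictReader(io.StringIO(csv_text)):
        rec = ET.SubElement(root, row_tag)
        for field, value in row.items():
            ET.SubElement(rec, field).text = value
    return ET.tostring(root, encoding="unicode")
```

The fragility the article describes lives in the field mapping: when the "standard" shifts, it is this mapping, not the transport, that has to be reworked.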

A growing requirement of managed file transfer…

The world of managed file transfer has evolved to enable companies that need to move big data to do so as efficiently as possible. Streamlining the delivery of data (of varying types, sizes and structures) from external trading partners onto internal big data analytics solutions is becoming a much more common requirement from our customers.

If you have a Big Data file transfer project and would like the assistance of our pre-sales and technical experts, contact us here or call +44 (0) 20 7118 9640.

Download a Comparison of 8 Leading Managed File Transfer Solutions!

In this essential pack you’ll also find…

  • Key features and frequently asked questions

  • Other business policies that will need to be considered

  • Access to additional resources

  • Side by side comprehensive comparison

    * Updated to include new vendors (October 2015)

Some Thoughts on TCP Speeds

As a consultant in file transfer technologies, a common complaint I find myself addressing is the speed at which a file travels between two servers. Generally speaking, many people expect that if they have two servers exchanging files on a dedicated, traffic-free 1 Gbps line, their transfer speed should be somewhere close to this.

TCP Vs UDP

One of the first things to consider is the way that TCP works compared to UDP. Whichever protocol is used, data is broken into packets when being sent to the receiving computer. When UDP (User Datagram Protocol) is used, the packets are sent 'blind': the transfer continues regardless of whether the data is being successfully received or not. This potential loss may result in a corrupted file – in the case of a streamed video this could be some missing frames or out-of-sync audio, but generally it will require the file to be resent in its entirety. The lack of a guarantee makes the transfer fast, but unless combined with rigorous error checking (as done by several large-file-transfer vendors) it is often unsuitable for data transfers.

In contrast, TCP (Transmission Control Protocol) transfers data in a carefully controlled sequence of packets; as each packet is received at the destination, an acknowledgement is sent back to the sender. If the sender does not receive the acknowledgement within a certain period of time, it simply sends the packet again. To protect the sequence, further packets cannot be sent until the missing packet has been successfully transmitted and an acknowledgement received.
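
The resend-until-acknowledged behaviour can be illustrated with a toy simulation. This is not a socket implementation: a random draw stands in for packet (or acknowledgement) loss, and the point is simply that the data always arrives complete and in order, at the cost of extra attempts.

```python
import random

def tcp_like_send(packets, loss_rate=0.3, seed=1):
    """Deliver packets in order, retrying each until it gets through.

    A later packet is not attempted until the one before it has been
    'acknowledged' - the guarantee that keeps the stream intact.
    """
    rng = random.Random(seed)
    delivered, attempts = [], 0
    for pkt in packets:
        while True:
            attempts += 1
            if rng.random() > loss_rate:  # packet and its ACK both arrived
                delivered.append(pkt)
                break
    return delivered, attempts
```

Even at a 30% loss rate the stream is reconstructed perfectly; in real TCP the extra attempts also mean waiting out retransmission timers, which is where the delay comes from.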

Deliverability over speed / Calculating the Bandwidth Delay Product 

This emphasis on guarantee rather than speed brings with it a certain degree of delay, however; we can see this by using a simple ping command to establish the round-trip time (RTT) – the greater the distance to be covered, the longer the RTT. The RTT can be used to calculate the Bandwidth Delay Product (BDP), which we need to know when calculating network speeds. The BDP is the amount of data 'in flight', found by multiplying the bandwidth by the delay; so a round-trip time of 32 milliseconds on a 100 Mbps line gives a BDP of 390 KB of data in transit.
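
The arithmetic is easy to reproduce; this small helper uses the same units as the example above (Mbps and milliseconds in, bytes out):

```python
def bandwidth_delay_product(bandwidth_mbps, rtt_ms):
    """Bytes 'in flight' on the link: bandwidth (bits/s) x RTT (s), over 8."""
    return bandwidth_mbps * 1_000_000 * (rtt_ms / 1000.0) / 8

# 100 Mbps line, 32 ms round trip -> 400,000 bytes, i.e. about 390 KB
bdp_bytes = bandwidth_delay_product(100, 32)
```

Note the mixed conventions: bandwidth is quoted in decimal megabits, while the 390 KB figure divides the resulting bytes by 1024.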

Window Scaling

The sending and receiving computers have a concept of windows ('views' of buffers) which control how many packets may be transmitted before the sender has to stop transfers. The receive window is the available free space in the receiving buffer; when the buffer becomes full, the sender will stop sending new packets. Historically, the receive window was limited to 64 KB because TCP headers use a 16-bit field to communicate the current receive window size to the sender; however, it is now common practice to increase this value dynamically using a process called Window Scaling. Ideally, the receive window should be at least equal in size to the BDP.
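
On the application side, the receive buffer (and hence the ceiling on the advertised window) can be requested per socket. A sketch in Python; note that the kernel is free to round, double or cap the value you ask for, and both ends must support window scaling for large windows to take effect:

```python
import socket

def make_receiver_socket(rcvbuf_bytes):
    """Create a TCP socket and request a receive buffer sized towards the BDP."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, rcvbuf_bytes)
    return s

sock = make_receiver_socket(4 * 1024 * 1024)  # ask for a 4 MB buffer
granted = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
sock.close()
```

On Linux, for example, `getsockopt` typically reports double the requested value, and system-wide limits such as `net.core.rmem_max` cap what is actually granted.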

TCP speed fluctuations

The congestion window is set by the sender and controls the amount of data in flight. The aim of the congestion window is to avoid network overloading; if there are no packets lost during transmission then the congestion window will continually increase over the course of the transfer. However, if packets are lost or the receiver window fills, the congestion window will shrink in size under the assumption that the capacity of either the network or receiver has been reached. This is why you will often see a TCP download increase in speed then suddenly slow again.

A quick calculation…

One point to remember is that when talking about bandwidth, we tend to measure in bits; when referring to storage (window size or BDP) we measure in bytes. Similarly, remember to make allowance for 1 Mb = 1000 Kb, but 1 MB = 1024 KB.

So, given this, a 1 Gbps connection with a 60 ms round-trip time gives a BDP of 7.15 MB (1000 × 60 / 8 / 1.024 / 1.024). As mentioned, to fully utilise the 1 Gbps connection we must increase the receive window to at least the BDP. The default (non-scaling) value of 64 KB gives a throughput of only 8.74 Mbps: 64 / 60 × 8 × 1.024 = 8.738 Mbps.
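
Both figures fall out of one formula: a window-limited TCP connection can move at most one receive window per round trip. A quick check of the numbers above:

```python
def max_throughput_mbps(window_kb, rtt_ms):
    """Window-limited TCP throughput: one window per round trip."""
    bytes_per_second = window_kb * 1024 / (rtt_ms / 1000.0)
    return bytes_per_second * 8 / 1_000_000

# Default 64 KB window on a 60 ms round trip -> ~8.74 Mbps,
# regardless of the line rate
slow = max_throughput_mbps(64, 60)

# Window scaled up to the 7.15 MB BDP -> the full ~1 Gbps
fast = max_throughput_mbps(7.15 * 1024, 60)
```

The striking point is that the line rate never appears in the slow case: with a 64 KB window, a 1 Gbps link and a 10 Mbps link deliver almost the same throughput over that distance.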

So what can you do to speed up the transfer?

Logically, you would probably want the largest receive window possible, to allow more bandwidth to be used. Unfortunately, this isn't always a great idea; if the receiver is unable to process the data as fast as it arrives, many packets may queue up for processing in the receive buffer – and if any packet has been lost, all subsequent packets in the receive buffer will be discarded and resent by the sender, due to the need to process them in sequence.

You also need to consider the abilities of the computer at the other end of the connection – both machines need to be able to support window scaling (per RFC 1323) and selective acknowledgements (per RFC 2018).

Another option to investigate is the ability of several products to perform multithreading. Multithreaded transfers theoretically move quicker than single-threaded transfers, due to the ability to send multiple separate streams of packets; this somewhat negates the delays caused by having to resend packets in the event of loss. However, transfers may still be impacted by full receive windows or disk write speeds; in addition, any file sent via multiple threads needs to be reassembled on arrival, requiring further resources. In general, most large-file transfer software is written around multithreading principles, or a blend of UDP transfer with TCP control.
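
The chunk-and-reassemble idea can be sketched in a few lines. Here the "transfer" is just a ranged read of an in-memory payload; a real implementation would issue ranged requests over the network, but the structure is the same.

```python
from concurrent.futures import ThreadPoolExecutor

def fetch_chunk(source, offset, size):
    """Stand-in for one thread's ranged read (e.g. an HTTP Range request)."""
    return source[offset:offset + size]

def parallel_transfer(source, chunk_size, workers=4):
    """Fetch a payload as independent chunks, then reassemble them in order."""
    offsets = range(0, len(source), chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        chunks = pool.map(lambda off: fetch_chunk(source, off, chunk_size), offsets)
    return b"".join(chunks)
```

Because each chunk travels as its own stream, a lost packet only stalls that chunk; the in-order join at the end is the extra reassembly cost mentioned above.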

Finally, consider other network users – when using large Receive Windows, remember that as the amount of data in transit at any time increases, you may encounter network usage spikes or contention between traffic.

 

If you have any questions about the speed of your file transfers or your chosen file transfer technology and infrastructure design give our team of experts a call on 0207 118 9640.


Impact of Brexit on the GDPR

The opening statement of Information Commissioner Sir Christopher Graham's last annual report talked about "responding to new challenges, and preparing for big changes, particularly in the data protection and privacy field." He delivered his speech in the early aftermath of Brexit, and everyone was keen to get his view on the implications for the roll-out of the General Data Protection Regulation (GDPR).

Prior to Brexit

In April 2016, after two years of debate, the final terms of the European GDPR were agreed. The legislation comes into effect for member states in May 2018 and includes key changes such as:

  • The right to be forgotten
  • New stricter conditions for the adequate protection of file transfers
  • Privacy notices for individuals on how their data is handled
  • Tighter legislation around active consent for processing data
  • And shared liability for breaches between data controllers and data processors.

The change that many CIOs will be most concerned about is the increase in sanctions for data breaches, which now rise to up to 4% of annual global turnover.

Moving forward

When asked about the uncertainty, the Commissioner stated "We now need to consider the impact of the referendum on UK data protection regulation. It is very much the case that the UK has a history of providing legal protection to consumers around their personal data which precedes EU legislation by more than a decade, and goes beyond current EU requirements." He stressed that "Having clear laws with safeguards in place is more important than ever given the growing digital economy, and we will be speaking to parts of the government to present our view that reform of the UK law remains necessary."

But will the EU GDPR still affect us?

The changes in EU Legislation are due to come into effect in May 2018. As the debate over Article 50 continues, CIOs face on-going uncertainty. However, whether the UK is still a member of the EU or not, the new rules will still apply to many organisations. The newly agreed scope states that the law will apply to non-EU companies that are offering goods and services to EU citizens. Any UK organisation selling in Europe will still need to comply with GDPR.

In closing, the Commissioner reiterated that the ICO would continue to make sure that the current standard of excellence remains intact. “We must maintain the confidence of businesses and of consumers. The ICO stands ready to enforce the rules that remain and make the case for the highest standards going forward.”

Whatever the law is called, data protection is not going away.

If you're unsure how any of the current or upcoming data protection legislation affects your business's file transfer requirements, give our team of experts a call on 0207 118 9640.


Managed File Transfer Versus Middleware

Managed File Transfer and Middleware have both undergone a period of evolution in the past few years. Historically speaking, the early days of both can be easily traced back to the need to move data between various parts of a computer network, generally over simple protocols like FTP or RCP. As a consequence and especially as organisations began to move away from legacy environments, many networks contained an inordinate number of FTP servers, frequently with an unknown array of FTP clients pushing and pulling data in an often uncontrolled fashion.

Middleware stands up…

This became a standard argument for switching to a middleware product – taking back control of your network and the data that crossed it. Most early middleware systems used a hub-and-spoke architecture, providing a central point where all data would arrive and depart. Additionally, the notion of data transformation during transit became popular, rather than the more traditional manipulation during processing at the source or target system. A 'code once, use many times' approach appeared for interfaces, allowing a reduction in development costs, and the only limitation appeared to be the ever-growing range of available connectors.

The beginning of MFT…

FTP servers didn’t go away however; instead organisations began to centralise their FTP sites and a newer smarter generation of FTP server software began to appear. These early versions of Managed File Transfer quickly developed a common set of standard features – encryption, automation, protocol support and user management.

Which is which?

As both middleware and Managed File Transfer systems matured, the boundaries between them began to diminish, with Managed File Transfer performing some middleware functions and vice versa. We have now reached a point where the practical differences have become a little fuzzy; however, some simple guidelines can help an architect decide between a middleware and a managed file transfer approach.

A good starting point is data transformation. Traditionally this falls squarely within the realm of middleware; however, there are Managed File Transfer solutions which offer this feature well enough to be considered. In contrast, most middleware does not provide an FTP interface for end users, relying instead on web services for input or FTP clients for output. An organisation therefore has to review its requirements – does it need a Managed File Transfer solution with some middleware functionality, or middleware with some Managed File Transfer?

While trying to avoid generalisations, here are some things to consider that Managed File Transfer solutions provide and middleware ones generally don’t (or at least not well):

  • Enterprise File Sync and Share – the process of sharing data by sending a hyperlink via email is not well supported by middleware
  • Large File Transfer – Very large files are not suitable for transformation and therefore are not often considered by middleware vendors
  • File repository – Managed file transfer systems normally provide a repository of data for download, often encrypted
  • Home folder management – mostly, if a middleware system permits users to have home folders, these have to be manually created
  • Development and Deployment – on the whole, managed file transfer allows for faster design and rollout of interfaces than middleware, which often requires full development teams

Conversely, Middleware can provide functionality that Managed file transfer often struggles with, for example:

  • Mapping, database lookups and transformation – middleware supports complex mapping operations, either custom or using internationally recognised templates.
  • Customisable interfaces – middleware provides a framework for development, meaning bespoke designs can be implemented.
  • Peer-to-peer relationships – generally only available in specialised Managed file transfer products (using agents, for example), peer-to-peer interfaces are becoming more popular, especially when making use of cloud technology.
  • Adapter support – most middleware products provide adapters which allow connections to just about any kind of system. Managed file transfer systems are generally limited to a handful of transfer protocols.
  • Realtime support – with the exception of AS2 transfers, most MFT products are not well suited to synchronous transfers, whereas middleware will generally handle them without problem.

In Summary

When considering simple automation, large file transfers or user-initiated transfers, Managed File Transfer is better suited than middleware. When looking to introduce complicated interfaces, message transformations or realtime processing, consider middleware.

The best solution, however, comes when there is a symbiosis of the two: traffic passes through a Managed file transfer system and is handled by the middleware product. From an automation perspective, the flexibility of Managed File Transfer represents a tactical solution, whilst more persistent interfaces are developed using middleware.

If your company is considering implementing a system for securely exchanging data and integrating it into your internal network, you’ll need to know whether the features you require are provided by the leading managed file transfer solutions, or middleware systems. Download our free managed file transfer comparison guide, which provides an ‘at a glance’ list of features and much more:

Download a Comparison of 8 Leading Managed File Transfer Solutions!


Manchester United and the Exploding Mobile Phone

It hit me at 3.20 pm on Sunday, as fans started to pour out of the stadium, that the game had been cancelled. The end-of-season party was over, the 3,500 travelling Bournemouth fans fell silent and the inflatable beach balls were popped. This wasn't how it was meant to end.

At 8.00 am that morning we'd left Bournemouth in eager anticipation of seeing the mighty Cherries play at the Theatre of Dreams. We'd a five-hour road trip ahead of us, but it didn't matter; Bournemouth were safe from relegation and we were going to an end-of-season party. Nothing could spoil our day, or so we thought!

The sun was shining in Manchester as we approached the ground just before 3.00 pm. From outside the ground all we could hear was the Bournemouth fans in full song; the stadium was ringing with "We've got more fans than you". It should have registered then that something was afoot, as our fans should have been outnumbered 20 to 1.

We passed through the gates, with security guards searching us and our tickets checked, and only then were we permitted into the ground. After a quick pint we made our way out from the bowels of the stadium to our seats. Nothing could prepare us for the sight before us. The pitch was empty, with no ground staff or players warming up; the North and East stands in front of us were completely empty, not a soul in sight. Only the West stand contained any United fans, plus of course the 3,500 noisy Bournemouth fans in the away end. Something was wrong, very wrong!

After a couple of minutes at our seats we noticed the arrival of sniffer dogs and security personnel scouring the other side of the stadium, which suggested something serious. However, with regular communications over the tannoy system and stewards close by keeping us updated as they got news, the fans felt assured they were in good hands.

When the announcement finally came that the game had been abandoned due to security concerns, fans were efficiently ushered through the nearest exit and away from the stadium by the security personnel outside.

What Manchester United demonstrated on Sunday was the meticulous planning that had gone into dealing with a security breach. They had reporting procedures in place, lines of communication open to ensure constant updates were available, resources in situ to manage developments at a local level and, if needed, they knew how to quickly close down operations to avert a disaster. Whilst I was irritated at spending nearly 11 hours in total travelling and not getting to watch the football, you had to wonder at the efficiency shown by Manchester United.

It begs the question: what processes, procedures, lines of communication and disaster planning does your business have in place to cater for a security breach? Can your company demonstrate Premier League standards to your customers? Does your organisation have the visibility it needs to know when there's a problem, and are your processes robust enough?

If you're not sure, then the answer is probably no. This week we're offering 10 free audits to assess whether your current file transfer strategies and technology are up to Premier League standards. So if you'd like to stay in the game when all around you are trying to knock you off top spot, speak to one of our data security consultants today on 020 7118 9640.

Secure File Sharing at the Local Government Strategy Forum

Heythrop Park, April 12th – 13th

This month I attended my second Local Government Strategy Forum, at the beautiful Heythrop Park Resort in Oxfordshire. Invited by our partner Maytech, I was the ‘independent industry expert’ and had the pleasure of spending two days in this lovely environment, talking with senior management and C-suite executives from councils all around the UK.

Before attending these events I held what I believe to be a commonly held opinion: that council workers were underworked and overpaid. I'd read all the stories in the local press about six-figure salaries and the cancellation of services to fund lavish lifestyles. However, I'd never stopped to think about what they actually did. Listening intently at these events has given me a small insight into the workings of councils, and whilst I'm sure there are still more efficiencies to be realised, I couldn't have more admiration for the wide range of services they provide and the challenges they face in prioritising them to balance the books.

The financial challenges being faced by councils have led them to adopt a more business-like approach. They are looking at every aspect of their operations to drive out wastage and streamline processes, and that's where my expertise came in.

John Lynch, CEO, Maytech – presenting Quatrix on day one

Over the two days I spoke with in excess of 50 delegates about their data sharing, collaboration, secure file transfer and business process automation challenges. Our experience in this area, working with councils such as Cambridgeshire County Council, North East Lincolnshire Council and, most recently, Mid-Sussex Council, meant we already had a view on some of the challenges faced in public sector data sharing.

As ever, there is no one technology that addresses the wide range of data sharing requirements of councils; our council customers are using solutions from five of our suppliers. The service we provide is to help them fully understand their requirements and then choose the right solutions for their needs and budget.

If your council or company needs to address its file sharing, collaboration and secure file transfer requirements why not download one of our free resources below:

What is Managed File Transfer?

Managed File Transfer Starter Pack

Comparison Guide

Building a Business Case for MFT