Today data is the lifeblood of most organisations, and that data is typically spread across on-premises and cloud systems. Building a cost-effective, well-protected, secure, cloud-optimised data strategy is therefore more important than ever. Let’s explore what we see as 12 of the most important data strategies for 2021.
Quick Links
- Use the cloud where it makes economical & practical sense
- Keep things as simple as possible & avoid silos by using smart unified storage
- Use flash for active data & the cloud or NL-SAS for inactive data
- Reduce on-premises & cloud storage costs with proven efficiency technologies
- Reduce complexity by minimising the use of SAN LUNs
- Maintain offsite backups for all on & off-premises services
- Use storage that provides integrated snapshot-based data protection
- Replicate data to a secondary site or the cloud for disaster recovery
- Use active/active storage when continuous availability is required
- Monitor the health, security hardening, performance & capacity of all services
- Use enterprise-grade NAS for large scale file shares
- Use file shares for collaboration rather than OneDrive
- Conclusion
Use the cloud where it makes economical & practical sense
Clearly the cloud is transforming IT, but it should not be seen as the ultimate destination for every service, so you should not plan to migrate 100% to the cloud. Some organisations may well achieve this, but I suspect for most the best place for their data will vary – SaaS applications such as Microsoft 365 may be perfect for some services, whilst others may be better served by on-premises solutions.
Most importantly, do not move a service to the cloud until you are clear on the technical and commercial pros and cons – this is common sense 101, but it’s amazing how many organisations ignore it. Above all, do not put all your eggs in one basket: hybrid cloud and multi-cloud are the ways to think about modern IT.
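To make the commercial side of that assessment concrete, here is a minimal Python sketch of the kind of like-for-like cost comparison worth doing before any migration. Every figure below is a hypothetical placeholder rather than real pricing; substitute your own quotes and remember to include egress, licensing and people costs.

```python
# Minimal sketch of a like-for-like cost comparison for one workload.
# All figures are hypothetical placeholders, not real pricing.

def three_year_cost(capex: float, monthly_opex: float) -> float:
    """Total cost of ownership over 36 months: up-front spend plus recurring charges."""
    return capex + 36 * monthly_opex

# Hypothetical on-premises option: hardware purchase plus support, power and admin time.
on_prem = three_year_cost(capex=60_000, monthly_opex=1_200)

# Hypothetical cloud option: no capex, but compute, storage and egress charges every month.
cloud = three_year_cost(capex=0, monthly_opex=2_900)

print(f"On-premises: {on_prem:,.0f}  Cloud: {cloud:,.0f}")
print("Cheaper over three years:", "on-premises" if on_prem < cloud else "cloud")
```

The point is not which option wins in this made-up example, but that the comparison is done per workload, over a realistic period, before anything is moved.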
Keep things as simple as possible & avoid silos by using smart unified storage
Most people today would not carry a separate MP3 player, sat nav, calculator, voice recorder, camera and mobile phone; instead they would have a smartphone.
The same principle applies to storage: why would any organisation want a different storage platform for mid-range, high-end, SAN, scale-out NAS, data protection, hybrid-flash and all-flash? Of course they wouldn’t. Instead, their objective would be to keep their IT as simple as possible by eliminating complexity at every opportunity.
To achieve this you need a smart, software-defined and constantly innovating storage OS that provides:
- A focus on ease of use, but not at the expense of functionality
- Support for all key storage protocols (NVMe, FC, iSCSI, NFS, SMB & S3)
- Support for both SSDs and HDDs, with automated tiering between them
- Tight integration with the cloud (i.e. replication, tiering & backup)
- Proven efficiency technologies (i.e. compression & deduplication)
- Powerful data protection, disaster recovery & business continuity features
- High availability including protection from double drive failure as standard
- Support for running:
  - In the cloud (i.e. AWS, Azure & GCP)
  - On commodity server hardware
  - On multiple generations of hardware appliances
- Non-disruptive migration to new hardware appliances
- The ability to start small, scale large and scale both up and out
- Rapid adoption of new standards and technologies
Ultimately your organisation needs a storage OS that is akin to what you have in your pocket – Google Android or Apple iOS.
Use flash for active data & the cloud or NL-SAS for inactive data
Much like cloud is not the answer to all IT requirements, flash is not the answer to all data storage requirements. Equally importantly, storage performance is no longer a technical challenge, as most vendors can provide more than enough IOPS for the vast majority of needs (there are, of course, edge cases that need extreme performance, which is not so straightforward).
Performance is therefore no longer something to focus too much on; cost, simplicity and features are far more important. The challenge for vendors is to deliver the required capacity at a low cost, which requires real-world capacity savings from a set of advanced data reduction features and the use of lower-cost storage tiers (cloud or NL-SAS drives) for inactive data.
As with everything to do with the cloud, don’t assume it’s the superior solution and that NL-SAS drives are legacy. Both options are valid; you just need to weigh up the technical and economic pros and cons. Most importantly, you need a storage platform that supports the ability to:
- Efficiently store your key application data 100% on SSDs
- Efficiently store your low IO data 100% on NL-SAS HDDs
- Intelligently tier inactive data to the cloud or NL-SAS drives (a simple tiering rule is sketched below)
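As a simple illustration of that last point, here is a hedged Python sketch of an "inactive after N days" tiering rule. Real platforms track activity at the block level and move data automatically; this only shows the shape of the policy decision, and the 30-day threshold and tier names are assumptions.

```python
# Illustrative tiering rule: keep recently accessed data on flash, move the rest to a
# cheaper capacity tier. The threshold and tier names are assumptions for this sketch.
from datetime import datetime, timedelta

COOLING_PERIOD = timedelta(days=30)  # hypothetical definition of "inactive"

def choose_tier(last_accessed: datetime, now: datetime) -> str:
    """Return the tier this data should sit on under the simple policy above."""
    if now - last_accessed <= COOLING_PERIOD:
        return "performance tier (SSD)"
    return "capacity tier (cloud object storage or NL-SAS)"

now = datetime.now()
print(choose_tier(now - timedelta(days=3), now))    # performance tier (SSD)
print(choose_tier(now - timedelta(days=120), now))  # capacity tier (cloud object storage or NL-SAS)
```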
Reduce on-premises & cloud storage costs with proven efficiency technologies
Flash makes the most economic sense when you can apply storage efficiency technologies to increase the effective capacity to around 3:1, which is much easier said than done and requires a mature storage OS with a range of proven efficiency features.
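As a quick worked example of what that ratio means in practice (the capacities below are purely illustrative):

```python
# Back-of-the-envelope effective capacity, assuming the array really achieves the
# combined compression + deduplication ratio it claims. Figures are illustrative.

raw_usable_tb = 50          # hypothetical usable flash capacity after RAID and spares
data_reduction_ratio = 3.0  # the often-quoted 3:1

effective_tb = raw_usable_tb * data_reduction_ratio
print(f"{raw_usable_tb} TB usable at {data_reduction_ratio}:1 = {effective_tb:.0f} TB effective")
# 50 TB usable at 3.0:1 = 150 TB effective, but only if the savings hold on your real data
```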
Many vendors will claim high effective capacity numbers for their platforms, but these will not be achieved in the real world if they only have simple volume-based deduplication and compression. You need to get them to explain in detail how their technology works, and I would recommend looking out for features like:
- Compression – should be inline and post-process, with different algorithms applied to hot and cold data (i.e. small blocks for hot and larger for cold)
- Deduplication – should be inline and post-process, and work across volumes
- Automation – should be set and forget, with the ability to turn off features if they add no value
- Block sizes – should be small (i.e. 4K not 16K), with the ability to allocate more than one logical block per physical block
There is no silver bullet; instead you need a blend of technologies to deliver high capacity savings. In addition, space-efficient cloning and tiering to cloud object storage or NL-SAS drives will help to reduce costs even further.
It’s also worth thinking about cloud storage much like flash: natively it’s expensive, so you should apply all of these efficiency technologies in the cloud too, to help keep your costs down.
Reduce complexity by minimising the use of SAN LUNs
I do not believe that most IT technologies become obsolete over time and therefore should be ripped out and replaced with more modern solutions. Some examples are replacing HDDs with SSDs, replacing tape with disk or cloud, and replacing entire data centres with the cloud – sometimes we should, but clearly nowhere near 100% of the time.
The one technology that I do think should go “the way of the dodo” is the LUN: a legacy technology that emulates a local drive but makes it shareable. It has not really been enhanced over the last few decades, and it amazes me that so many storage administrators and vendors still focus exclusively on it (to be fair, I know why the vendors do – it’s fairly simple, so it keeps their development costs down).
What if there was a better way of provisioning storage that could:
- Non-disruptively expand and shrink (try doing that with a LUN)
- Grow to almost any size (no more managing 100s or even 1,000s of LUNs)
- Configure QoS performance policies at the individual VM/database level
- Enable browsing of snapshots to explore the content and restore individual files
Whether you are using VMware vSphere, Hyper-V, SQL Server or Oracle, NAS datastores will provide all of these benefits. As for performance, modern NFS and SMB protocols can hold their own against Fibre Channel and iSCSI, with the only exception being NVMe, which is still the optimum protocol if extreme performance is required.
Of course there may be edge scenarios where only LUNs are supported, which is why it’s important to have a storage platform that can support modern NAS datastores alongside “legacy” SAN LUNs.
Maintain offsite backups for all on & off-premises services
This of course is IT administration 101: no one would maintain an on-premises infrastructure without the ability to recover from multiple points in time, with backups held offsite on tape, disk or cloud storage. If a low RTO (recovery time objective, i.e. the time to recover) and a low RPO (recovery point objective, i.e. the amount of data lost) are required, then data replication is also typically deployed.
Now it has to be said that it’s very unlikely an organisation will suffer a catastrophic data loss and need to restore complete systems or even data centres from backups or replicas, but it clearly does happen. So why do most organisations apply completely different data protection and disaster recovery standards when it comes to the cloud?
Much like on-premises infrastructure, it’s unlikely you will have a catastrophic failure and lose data stored in the cloud, but it does happen, so you should maintain independent backup and replica copies – maybe even on a completely different cloud (e.g. use AWS to protect your Azure data).
Use storage that provides integrated snapshot-based data protection
Historically, shared storage was dominated by expensive SANs that, let’s be honest, didn’t do much beyond presenting LUNs to multiple hosts and replicating data. Today we need our storage technology to do so much more.
One key feature is the ability to provide integrated data protection with the following capabilities:
- Frequent immutable snapshots (i.e. taken every 15 minutes) to minimise data loss
- Replication of snapshots so you always have offsite copies of your historic data
- The ability to maintain 100s of snapshots without a performance overhead
- Recovery of complete volumes or just a selection of files from a snapshot
- Near instant restore from a snapshot to minimise downtime
- Replication of snapshots along with the data, so disaster recovery is achieved with maximum efficiency
- The ability to efficiently replicate snapshots to the cloud for long-term retention
- Creation of application-aware backups (e.g. for VMware vSphere, SQL Server & Oracle)
- Integration of snapshot management with the leading backup applications (e.g. Veeam & Commvault), with the ability to back up to tape or cloud for long-term retention
In this day and age, if you want to quickly dig yourself out of trouble by restoring large amounts of data (maybe due to one of the ever more common ransomware attacks), don’t think backups, think snapshots.
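To put some numbers on the first three points in the list above, here is a tiny Python sketch showing the worst-case data loss of a fixed snapshot schedule and how many snapshots a retention policy has to hold. The 15-minute interval and 7-day window are illustrative, not a recommendation.

```python
# Minimal sketch, assuming a simple fixed-interval snapshot schedule.
from datetime import timedelta

snapshot_interval = timedelta(minutes=15)  # "frequent snapshots" from the list above
retention = timedelta(days=7)              # hypothetical short-term retention window

worst_case_rpo = snapshot_interval         # at most one interval of changes can be lost
snapshots_kept = retention // snapshot_interval

print(f"Worst-case RPO: {worst_case_rpo}")              # 0:15:00
print(f"Snapshots held over 7 days: {snapshots_kept}")  # 672
```

Holding hundreds of snapshots like this is exactly why they need to be space-efficient and free of any performance overhead.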
Replicate data to a secondary site or the cloud for disaster recovery
Since the dawn of the SAN, high-end arrays have been able to replicate data, but basic replication is no longer good enough; today we need a rich set of capabilities, including:
- Asynchronous replication – with an RPO that can be as low as a few minutes
- Synchronous replication – with an RPO of zero
- Active/active replication – for continuous availability (i.e. RPO of zero & near zero RTO)
- Cloud replication – to and between all major platforms (i.e. AWS, Azure & GCP)
- Efficient replication – that maintains the savings from compression and deduplication
You should be able to dynamically change a volume’s replication type (for example, something configured with asynchronous replication could be changed to synchronous and then again to active/active) without needing to replicate the data from scratch.
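Here is a small conceptual model of the modes above and the kind of non-disruptive mode change described. It is plain Python, not any vendor's API; the names, fields and the five-minute asynchronous RPO are assumptions for illustration only.

```python
# Conceptual model of the replication modes listed above. Names, fields and the
# five-minute asynchronous RPO are assumptions for illustration only.
from dataclasses import dataclass

@dataclass
class ReplicationPolicy:
    mode: str         # "asynchronous", "synchronous" or "active/active"
    rpo_seconds: int  # worst-case data loss the mode is designed for

POLICIES = {
    "asynchronous": ReplicationPolicy("asynchronous", rpo_seconds=300),
    "synchronous": ReplicationPolicy("synchronous", rpo_seconds=0),
    "active/active": ReplicationPolicy("active/active", rpo_seconds=0),  # plus near-zero RTO
}

def change_mode(volume: dict, new_mode: str) -> dict:
    """Switch a volume between modes. The key requirement in the text is that the
    platform does this without having to replicate all of the data again."""
    volume["replication"] = POLICIES[new_mode]
    return volume

vol = {"name": "vol_finance", "replication": POLICIES["asynchronous"]}
vol = change_mode(vol, "synchronous")
vol = change_mode(vol, "active/active")
print(vol["name"], "->", vol["replication"].mode)
```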
Use active/active storage when continuous availability is required
Technologies like VMware vSphere Metro Storage Cluster (vMSC) enable you to stretch a cluster across two sites. This is achieved by synchronously replicating the storage, providing read/write access to the data at both sites and automating the handling of any failure scenarios such as a site, storage device or link going down.
The net result is that a catastrophic failure at one site would see all of the virtual machines automatically restarted at the second site using VMware HA. You would also have the ability to load balance virtual machines by non-disruptively moving them between sites using VMware vMotion.
Monitor the health, security hardening, performance & capacity of all services
Whether your applications are running on-premises or in the cloud, you need to keep a close eye out for anything that might impact service availability. Running out of storage, compute or network capacity, disabled snapshot backups, failing replication jobs, unpatched systems, or falling behind the latest security hardening best practices: if issues like these are not fixed quickly they are likely to cause serious performance problems or even significant downtime.
For example, if you are hit by a ransomware attack or any other serious local data issue you should be able to recover in minutes by restoring from an immutable snapshot, so it’s essential to make sure your snapshot schedule is being fully maintained. Likewise, if you have a site-level failure you should be able to recover to a secondary site using an up-to-date replica copy, so you need to make sure all of your replication jobs are running smoothly.
There are two sides to capacity issues. The most obvious is when the business is directly impacted because there is not enough of a resource; the other is the exact opposite, when there is too much and money is being wasted – this can be a particularly big problem with cloud services.
You must therefore deploy the tools and set up the processes to make sure that your IT services are always operating optimally.
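The sketch below shows the kind of checks those tools and processes should be making. It assumes you can pull these metrics from your storage and cloud APIs; every threshold and field name is a made-up example rather than a recommendation.

```python
# Illustrative health checks covering capacity (too little and too much), data
# protection and hardening. Thresholds and field names are hypothetical.
from datetime import datetime, timedelta

def check_service(svc: dict) -> list:
    """Return a list of warnings for one service; an empty list means healthy by these rules."""
    warnings = []
    if svc["capacity_used_pct"] > 85:
        warnings.append("capacity running out")
    elif svc["capacity_used_pct"] < 20:
        warnings.append("possibly over-provisioned (money being wasted)")
    if datetime.now() - svc["last_snapshot"] > timedelta(hours=1):
        warnings.append("snapshot schedule not being maintained")
    if datetime.now() - svc["last_successful_replication"] > timedelta(hours=1):
        warnings.append("replication lagging")
    if not svc["hardening_baseline_met"]:
        warnings.append("security hardening drift")
    return warnings

example = {
    "capacity_used_pct": 91,
    "last_snapshot": datetime.now() - timedelta(minutes=20),
    "last_successful_replication": datetime.now() - timedelta(hours=3),
    "hardening_baseline_met": True,
}
print(check_service(example))  # ['capacity running out', 'replication lagging']
```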
Use enterprise-grade NAS for large scale file shares
Windows Server provides an excellent file-sharing platform, but at scale (i.e. when you have many 10s or 100s of TBs of data and beyond, or you simply need many file servers) an enterprise-grade NAS platform is much simpler to manage, secure, protect and expand – it can even support a single file system that scales to many 10s of PBs.
It’s important to emphasise that there is a world of difference between enterprise-grade NAS and low-end solutions that merely tick the boxes of basic NFS and SMB support. The latter simply cannot provide the level of file-serving functionality, native to both Windows Server and enterprise NAS solutions, that users need.
Use file shares for collaboration rather than OneDrive
You would think that OneDrive would kill off the file share, but the reality is that it’s a great home drive replacement. If you are using Office (Word, Excel or PowerPoint) it has the fantastic ability to let multiple users work on the same document and instantly see each other’s changes, but it’s not so good for non-Office files or when many users need to make changes.
If a number of people are working on the same file it’s quite common to end up with multiple versions, because OneDrive does not lock the file and therefore cannot give exclusive write access to just one user, which is far from ideal.
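To illustrate what "exclusive write access" means in practice, here is a deliberately simplified Python sketch using an atomic lock-file convention. SMB file shares enforce locking at the protocol level rather than like this, so treat it purely as a conceptual illustration of one writer at a time.

```python
# Conceptual illustration of one-writer-at-a-time access using a lock file.
# Real file shares enforce locking in the SMB protocol; this is just the idea.
import os

def acquire_exclusive(path: str) -> bool:
    """Try to become the sole editor by atomically creating a lock file."""
    try:
        fd = os.open(path + ".lock", os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        os.close(fd)
        return True   # we now have exclusive write access
    except FileExistsError:
        return False  # someone else is already editing

def release_exclusive(path: str) -> None:
    os.remove(path + ".lock")

if acquire_exclusive("budget.xlsx"):
    try:
        print("Safe to edit: no one else can acquire the lock until it is released.")
    finally:
        release_exclusive("budget.xlsx")
else:
    print("File is being edited by someone else; open it read-only instead.")
```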
Whether you want to deploy them locally or centrally (i.e. in the cloud or in your head office, possibly with a local cache), file shares still have a place in most organisations.
Conclusion
By adopting those of these strategies that are relevant to you, you will be able to build a world-class data strategy for your organisation.
So is there a storage platform that can help you deliver such a strategy? Of course there is, and the key is that it must not be LUN-focused (I think you can tell I am not a fan), as data is far more about files than it is about block storage. You will need a platform that puts the NFS, SMB and S3 object protocols at least on a par with Fibre Channel and iSCSI.
You can learn more about such a platform by visiting Why NetApp for your on-premises & cloud storage, and you might also like to read our blog Flash Optimised, Cloud Integrated & Multi-Protocol – the table stakes of modern storage.
If you have any questions or require further assistance, do get in contact.