S3 vs. NFS: A Practical Comparison

Both Amazon S3 and NFS can be used to provide access to static content. Your web page can reference a file on an NFS mount exactly like a local file, using only the file path, with no need for a full URL. S3 is preconfigured to act as a static web server, so each object has a URL. If you give public or controlled access to an object, your web page can reference its full S3 URL. Putting static content and media on S3 significantly reduces the resource requirements for your EC2 instances. You can stream huge static video files without provisioning an expensive virtual machine.
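To make the "each object has a URL" point concrete, here is a minimal sketch that builds the virtual-hosted-style HTTPS URL for an object. The function name, bucket name, and key are hypothetical, and the object is assumed to be readable by whoever requests it.

```python
from typing import Optional
from urllib.parse import quote


def s3_object_url(bucket: str, key: str, region: Optional[str] = None) -> str:
    """Build a virtual-hosted-style HTTPS URL for an S3 object.

    The bucket and key are placeholders supplied by the caller; nothing
    here checks that the object exists or is publicly readable.
    """
    if region is None:
        host = f"{bucket}.s3.amazonaws.com"
    else:
        host = f"{bucket}.s3.{region}.amazonaws.com"
    # Percent-encode the key but keep "/" so the path still reads like a hierarchy.
    return f"https://{host}/{quote(key, safe='/')}"


# Hypothetical bucket and key, for illustration only.
print(s3_object_url("example-media-bucket", "videos/intro clip.mp4"))
```

A page served from EC2 could then use the returned string as the `src` of a video or image tag, instead of a local path on an NFS mount.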

There are several domains of comparison between NFS and cloud storage such as S3, including management, availability, and performance. This piece provides guidance to IT infrastructure architects and operations teams on the challenges they might face when dealing with both types of storage systems.

Mount and Access

NFS acts like a local drive, but if you want to make the files it stores publicly accessible, you need to configure a server that has access to serve the contents of the mounted directory.

S3 is configured to serve objects via HTTPS interfaces using REST, SOAP, and BitTorrent.

As a public cloud storage solution, S3 is configured to serve files to the Internet from the start. It defaults to high security that only allows the creator to access it, so access should be set according to your needs.

With NFS, you need to spend a lot of time setting up each of these protocols on a server that has the NFS share mounted. S3 serves all of its objects through a web interface, whereas setting up NFS to serve via REST, SOAP, and BitTorrent is complicated. Serving files over the Internet from NFS through these protocols also adds latency, even though the same share might be lightning fast in your office.

As object storage, S3 is not designed to be mounted easily on computers in your data center, or even on EC2 instances. Utilities such as S3FS can mount an S3 bucket on a local Linux machine; however, you might find them buggy and unreliable. Consider them for cases such as supporting a legacy application or a “manual migration” of data. After copying, check file sizes and file counts to make sure every file was copied.
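The check on file sizes and counts can be scripted. The sketch below assumes both the source directory and the mounted bucket are visible as ordinary local paths; the function names are hypothetical, and comparing sizes only is a quick sanity check, not a content checksum.

```python
from pathlib import Path


def tree_summary(root):
    """Map each file's path (relative to root) to its size in bytes."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): p.stat().st_size
        for p in root.rglob("*")
        if p.is_file()
    }


def compare_trees(source, copy):
    """Return (missing, mismatched): files absent from the copy, and
    files present in both trees whose sizes differ."""
    src, dst = tree_summary(source), tree_summary(copy)
    missing = sorted(set(src) - set(dst))
    mismatched = sorted(k for k in src.keys() & dst.keys() if src[k] != dst[k])
    return missing, mismatched
```

Calling `compare_trees` with the original directory and the mount point (both placeholders for your own paths) should return two empty lists when the copy is complete.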

Once you understand that S3 is different from NFS and other kinds of storage, you will be able to use S3 for many purposes and with ease. A very popular tool for accessing S3 is the S3 Tools package. It is a CLI program written in Python. You can include its commands in your app, if necessary, or just use the S3 API.

When you have thousands of files in a subdirectory, S3 takes longer to find them. This is because each object is stored together with its metadata and file name, whereas NFS is based on a block storage file system that indexes file names. Amazon gives advice for significantly increasing the speed of accessing a single file in a bucket with thousands of objects: assign a random hash prefix to each file name, placed before the creation-time part of the name.

We are used to storing files in a hierarchy, using a slash to move into the next layer. S3 stores all of the objects together; however, it recognizes slashes as delimiters to help you organize things. The S3 panel uses the delimiters to show you a hierarchy, just like other storage.
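Both ideas can be sketched in a few lines. The names below are hypothetical, and two assumptions are made: the prefix is derived from an MD5 hash of the name rather than being truly random, so the key can be recomputed later, and `list_level` only imitates on a plain Python list how a delimiter turns flat keys into apparent folders. It does not call S3.

```python
import hashlib
from datetime import datetime


def hashed_key(filename: str, created: datetime, prefix_len: int = 4) -> str:
    """Put a short hash in front of the creation-time part of the key."""
    stamp = created.strftime("%Y-%m-%d-%H-%M-%S")
    base = f"{stamp}-{filename}"
    digest = hashlib.md5(base.encode("utf-8")).hexdigest()
    return f"{digest[:prefix_len]}-{base}"


def list_level(keys, prefix="", delimiter="/"):
    """Split flat keys into ("folders", objects) directly under prefix."""
    folders, objects = set(), []
    for key in keys:
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        head, sep, _ = rest.partition(delimiter)
        if sep:
            # Something follows the delimiter, so this key sits in a deeper "folder".
            folders.add(prefix + head + delimiter)
        else:
            objects.append(key)
    return sorted(folders), sorted(objects)


keys = ["logs/2017/a.txt", "logs/2017/b.txt", "logs/readme.txt", "index.html"]
print(list_level(keys))           # top level: one "folder" and one object
print(list_level(keys, "logs/"))  # one level down
print(hashed_key("backup.tar", datetime(2017, 5, 4, 12, 0, 0)))
```

The keys themselves stay flat strings; the hierarchy exists only in how they are grouped when listed.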

Amazon offers a product that tries to combine the usefulness of both S3 and NFS. It is called EFS; you can mount it via the NFS protocol and get unlimited file storage. Note that it can only be mounted within an AWS VPC, not from your data center via the Internet. Also consider the price, as EFS is more expensive than S3.

Deletion and Versioning

Deleting data that you don’t want is quick and simple with NFS. You can simply issue an “rm” command with whatever arguments you need to delete specific files among thousands, and you can combine it with “find”, “grep”, and all of the other powerful Linux commands. With rm, the files disappear almost immediately after you press Enter.
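The same kind of selective cleanup can be scripted against an NFS mount, because it behaves like a local directory. This is a rough Python equivalent of combining find with rm; the function name, the default pattern, and the age threshold are assumptions for illustration. It defaults to a dry run so nothing is deleted by accident.

```python
import time
from pathlib import Path


def delete_matching(root, pattern="*.bak", older_than_days=0, dry_run=True):
    """Delete files under root that match pattern and were last modified
    at least older_than_days ago. Returns the list of selected paths."""
    cutoff = time.time() - older_than_days * 86400
    # Collect candidates first so the tree is not modified while it is walked.
    selected = [
        p
        for p in Path(root).rglob(pattern)
        if p.is_file() and p.stat().st_mtime <= cutoff
    ]
    if not dry_run:
        for path in selected:
            path.unlink()
    return selected
```

Run it once with `dry_run=True` to review the selected paths, then again with `dry_run=False` to remove them.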

On S3, deleting thousands of files is cumbersome. Although dropping a bucket is quick, you might want to keep some files in the bucket. Even when all of the arguments are passed properly to the S3 Tools CLI, deleting the files can take many hours. The easiest way to delete old backup files is to set an expiry time for files in the S3 bucket; they then disappear after the configured date. In the AWS panel, the properties appear in the right column when you select a bucket, and from there you can configure the bucket lifecycle.
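The expiry rule can also be set programmatically instead of through the panel. The sketch below builds a lifecycle configuration in the shape accepted by the `put_bucket_lifecycle_configuration` call in boto3, the AWS SDK for Python. The function names, rule ID, prefix, and number of days are assumptions, and `apply_lifecycle` needs boto3 installed and AWS credentials configured.

```python
def expiry_lifecycle(prefix: str, days: int, rule_id: str = "expire-old-backups") -> dict:
    """Build a lifecycle configuration that expires objects under prefix
    once they are `days` days old."""
    return {
        "Rules": [
            {
                "ID": rule_id,  # hypothetical rule name
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Expiration": {"Days": days},
            }
        ]
    }


def apply_lifecycle(bucket: str, config: dict) -> None:
    """Send the configuration to S3 (requires boto3 and AWS credentials)."""
    import boto3  # third-party AWS SDK, imported here so the builder works without it

    boto3.client("s3").put_bucket_lifecycle_configuration(
        Bucket=bucket, LifecycleConfiguration=config
    )


# Hypothetical prefix and retention period, for illustration only.
print(expiry_lifecycle("backups/", 30))
```

Keeping the rule builder separate from the API call makes it easy to review the rule before applying it to a real bucket.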

NFS does not keep different versions of files, nor does it archive “deleted files” as most cloud services do today. S3 can be configured to keep versions of objects in buckets, so when you update or delete them you can always recover previous versions. This is handy, and means you don’t need to worry about continuous and consistent backup of your S3 buckets. The AWS S3 console and user experience make bucket setup really easy.

Security

NFS and S3 are diametrically opposed in their simplest configurations. NFS has no built-in security capabilities and can be easily spoofed. In a typical NFS server configuration file, rights are granted to a host, and that access can be easily spoofed using both the server name and the UID of the user that owns the files.

As mentioned, S3 is by comparison accessible by default only to the account that created it; nothing else can read it unless access is explicitly allowed.

Both can be configured to be controlled via LDAP, which is used to manage network configurations. AWS has a useful tool called AWS Directory Service that manages your resources like the Microsoft Active Directory service, which uses LDAP to control access to both NFS drives and S3. Therefore, if you need both, you can use the same security system.

When LDAP is used for NFS security, users are typically restricted in their ability to manage files and directories, especially on a physically mounted drive.

When S3 storage is used, users’ access, passwords, and keys are managed via the IAM policies assigned to each bucket. When you set up a new bucket, the Create Bucket Wizard helps you set permissions. Despite the easy-to-use console, it is a bit more complicated to set and control access to the different buckets correctly.
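As an illustration of what such a policy looks like, the sketch below generates a read-only bucket policy document in the standard AWS policy JSON format. The function name, bucket name, statement ID, and principal ARN are all hypothetical, and a real policy should be reviewed against your own access requirements before being attached.

```python
import json


def read_only_policy(bucket: str, principal_arn: str) -> str:
    """Return a JSON bucket policy that lets one principal read objects."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowReadOnly",  # hypothetical statement ID
                "Effect": "Allow",
                "Principal": {"AWS": principal_arn},
                "Action": ["s3:GetObject"],
                "Resource": f"arn:aws:s3:::{bucket}/*",
            }
        ],
    }
    return json.dumps(policy, indent=2)


# Placeholder account ID and user name, for illustration only.
print(read_only_policy("example-media-bucket",
                       "arn:aws:iam::123456789012:user/example-reader"))
```

Generating the document from code keeps policies for many buckets consistent, which helps with exactly the kind of per-bucket complexity described above.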

Final Notes: The Compatibility Challenge

NFS will not be going away anytime soon, and S3 will not miraculously become exactly like a mounted file system, so you need to know which one best meets your needs. In contrast to legacy systems that rely heavily on NFS, new code that you grab off GitHub, for example, will use the S3 API by default. And modern scalable architectures are pretty distributed nowadays. Adopting the cloud means you will need to find a way to use both S3 and NFS. While this comparison may help you decide which to choose, it might serve you better to learn the gaps you will face when using both and when integrating them.

Naturally, there are challenges when attempting to build secure and streamlined hybrid storage. In a broader sense, data on-premises is largely unstructured and non-objectified. Although cloud object storage has great advantages, complexities arise in a hybrid infrastructure configuration. Whether you want your on-premises storage to work seamlessly with the Amazon cloud or to connect to Azure, you have to ensure compatibility as part of your storage implementation plans. Incompatibilities between the NFS file system and the cloud’s structured object storage should be taken into consideration, and can even make you rethink your whole cloud migration plan.

One of the most important factors to consider is how you will utilize your on-premises storage infrastructure (e.g., NAS or SAN) and the fabric in your hybrid approach. Look for software-defined storage (SDS) solutions that can abstract the underlying storage system and unify your on-premises solution with your public cloud environments. And make sure to dive deep into each of the IT aspects, including security, performance, availability, and overall day-to-day operations.


May 4, 2017
