Problem Statement:

A company is hosting a web application on AWS using a single Amazon EC2 instance that stores user-uploaded documents in an Amazon EBS volume. For better scalability and availability, the company duplicated the architecture and created a second EC2 instance and EBS volume in another Availability Zone, placing both behind an Application Load Balancer. After completing this change, users reported that, each time they refreshed the website, they could see one subset of their documents or the other, but never all of the documents at the same time.

Replication of the problem:

An EC2 instance is created in the Mumbai region (ap-south-1), in the Availability Zone ‘ap-south-1a’, to host the web application.

Launch an EC2 Instance:

  • Choose an Amazon Machine Image (AMI), such as Amazon Linux 2.
  • Select an instance type (e.g., t2.micro for free tier).
  • Configure the instance details.
  • Add storage (the EBS volume that will hold the uploaded documents).
  • Configure security groups to allow HTTP (port 80) and SSH (port 22).
Fig: EC2 instance in ap-south-1a Availability zone

After the EC2 instance is launched, a web server is installed and configured: Apache and PHP are installed, and a simple web application is created that lets a user choose and upload a file.

Upload File

The uploaded file is then stored on the EBS volume attached to the instance.

For better scalability and availability, we duplicated the architecture and created a second EC2 instance and EBS volume in another Availability Zone ‘ap-south-1b’, placing both behind an Application Load Balancer.

Fig: EC2 instance in ap-south-1b Availability zone

Fig: Load Balancer

Now let’s upload a file through the instance in each Availability Zone.
A text file named ‘Instance1-uplaods.txt’ was uploaded through the instance in ap-south-1a, and a text file named ‘instanc2-uploads.txt’ was uploaded through the instance in ap-south-1b.

In the images shown below, Image 1 shows the document uploaded through the ap-south-1a instance, and Image 2 shows the document uploaded through the ap-south-1b instance after a refresh.

Every time the web application is refreshed, we see one subset of the documents or the other, but never all of the documents at the same time.

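Each web server serves content only from its own EBS volume, and the two EBS volumes are not synchronized. An EBS volume can be attached only to instances in its own Availability Zone, so the two instances cannot share one. Each request sent to the load balancer’s DNS name is routed to just one of the EC2 instances, so the response lists only the files stored on that instance’s volume.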
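The behaviour can be reproduced locally with a small simulation. This is only an illustrative sketch: two temporary directories stand in for the two EBS volumes, and the hypothetical list_documents function stands in for the load balancer by alternating between them (round robin is the Application Load Balancer's default routing algorithm).

```shell
# Simulation only: two local directories stand in for the two unsynchronized EBS volumes.
workdir="$(mktemp -d)"
mkdir -p "$workdir/volume-a" "$workdir/volume-b"

# Each "instance" stores its upload on its own volume.
touch "$workdir/volume-a/Instance1-uplaods.txt"
touch "$workdir/volume-b/instanc2-uploads.txt"

# Hypothetical stand-in for the load balancer: odd-numbered requests go to
# instance A, even-numbered requests go to instance B.
list_documents() {
  if [ $(( $1 % 2 )) -eq 1 ]; then
    ls "$workdir/volume-a"
  else
    ls "$workdir/volume-b"
  fi
}

first_refresh="$(list_documents 1)"
second_refresh="$(list_documents 2)"
echo "Refresh 1: $first_refresh"
echo "Refresh 2: $second_refresh"
```

Each refresh lists only the file held by whichever directory answered, which mirrors what the users reported.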

After refreshing:

Solution:

Architecture-diagram

In the EFS console, create a shared file system:

  • Go to the EFS service in the AWS Management Console.
  • Click “Create file system” and follow the prompts to create a new EFS file system.
  • Select the VPC where your EC2 instances are running.
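  • Use the default settings for availability and performance options.
  • Make sure the security group on the EFS mount targets allows inbound NFS traffic (TCP port 2049) from the EC2 instances.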
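For readers who prefer the AWS CLI, the mount targets can also be created per subnet with aws efs create-mount-target. The sketch below only builds and prints the commands, one per Availability Zone, so they can be reviewed before being run. All IDs are hypothetical placeholders, and it assumes the file system and the security group already exist.

```shell
# All IDs below are hypothetical placeholders; replace them with your own.
FILE_SYSTEM_ID="fs-0123456789abcdef0"
EFS_SECURITY_GROUP="sg-0aaaaaaaaaaaaaaaa"
# One subnet per Availability Zone (for example ap-south-1a and ap-south-1b).
SUBNET_IDS="subnet-0aaaaaaaaaaaaaaaa subnet-0bbbbbbbbbbbbbbbb"

# Build (but do not run) one create-mount-target command per subnet.
mount_target_commands() {
  for subnet_id in $SUBNET_IDS; do
    printf 'aws efs create-mount-target --file-system-id %s --subnet-id %s --security-groups %s\n' \
      "$FILE_SYSTEM_ID" "$subnet_id" "$EFS_SECURITY_GROUP"
  done
}

commands="$(mount_target_commands)"
printf '%s\n' "$commands"
```

An instance mounts the file system through the mount target in its own Availability Zone, which is why the sketch emits one command per subnet.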

Mount the EFS File System on EC2 Instances:

Install the EFS Mount Helper:

  • SSH into each of your EC2 instances and install the amazon-efs-utils package, which provides the EFS mount helper used by the mount command below:

sudo yum install -y amazon-efs-utils

 

Create a Mount Point:

  • Create a directory where you will mount the EFS file system.

sudo mkdir /mnt/efs

 

Mount the EFS File System:

  • Mount the EFS file system using the EFS file system ID.

sudo mount -t efs <file-system-id>:/ /mnt/efs

 

Verify the Mount:

  • Ensure the EFS file system is mounted correctly.

df -h
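Reading the df output by eye works, but the check can also be scripted. The following is a minimal sketch, assuming that an EFS mount appears in the mounts table with an NFS file system type (typically nfs4). The is_nfs_mounted helper is a hypothetical name, and the sample mounts table is made up so the check can be tried without a real EFS mount.

```shell
# Succeeds when the mount point ($1) is listed with an NFS file system type
# in the mounts table ($2, defaults to /proc/mounts).
is_nfs_mounted() {
  awk -v target="$1" '$2 == target && $3 ~ /^nfs/ { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

# Made-up sample mounts table, used here instead of the real /proc/mounts.
sample_mounts="$(mktemp)"
cat > "$sample_mounts" <<'EOF'
/dev/xvda1 / xfs rw,noatime 0 0
fs-0123456789abcdef0.efs.ap-south-1.amazonaws.com:/ /mnt/efs nfs4 rw,relatime 0 0
EOF

if is_nfs_mounted /mnt/efs "$sample_mounts"; then
  mount_status="mounted"
else
  mount_status="not mounted"
fi
echo "/mnt/efs is $mount_status"
```

On an instance, calling is_nfs_mounted /mnt/efs without the second argument reads the live /proc/mounts.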

 

Make the Mount Persistent:

  • Add an entry to /etc/fstab to ensure the EFS file system is mounted on reboot.

echo "<file-system-id>:/ /mnt/efs efs defaults,_netdev 0 0" | sudo tee -a /etc/fstab
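Appending with tee -a adds a duplicate line every time the command is repeated. A possible safeguard, sketched below, appends the entry only if that exact line is not already present. The helper names and the file system ID are hypothetical, and the demonstration writes to a temporary file rather than the real /etc/fstab.

```shell
# Print the /etc/fstab line for an EFS file system ($1) and mount point ($2).
efs_fstab_entry() {
  printf '%s:/ %s efs defaults,_netdev 0 0\n' "$1" "$2"
}

# Append the entry ($1) to the fstab file ($2) unless that exact line is already there.
add_fstab_entry() {
  grep -qxF -- "$1" "$2" || printf '%s\n' "$1" >> "$2"
}

# Demonstration against a temporary file; the file system ID is a placeholder.
fstab_file="$(mktemp)"
entry="$(efs_fstab_entry fs-0123456789abcdef0 /mnt/efs)"
add_fstab_entry "$entry" "$fstab_file"
add_fstab_entry "$entry" "$fstab_file"   # second call changes nothing

entry_count="$(grep -cxF -- "$entry" "$fstab_file")"
echo "fstab entries for /mnt/efs: $entry_count"
```

On a real instance the same helper would be run as root against /etc/fstab, and sudo mount -a afterwards is a quick way to confirm that the new entry mounts without errors.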

 

Update the PHP Application

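Navigate to the web server’s root directory.

Update your web application script, if necessary, so that uploaded files are saved to and listed from the EFS mount point (/mnt/efs in this example).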
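If the script cannot easily be changed, one alternative is to point its upload directory at the shared file system with a symbolic link. The sketch below assumes a hypothetical uploads directory under the web root and runs entirely inside temporary directories: one directory stands in for the EFS mount, and two others stand in for the web roots of the two instances.

```shell
# Point <web root>/uploads at the shared directory ($1) for the web root ($2).
link_uploads_dir() {
  mkdir -p "$1" "$2"
  ln -sfn "$1" "$2/uploads"
}

workdir="$(mktemp -d)"
shared_uploads="$workdir/efs/uploads"   # stands in for a directory on /mnt/efs

# Both simulated instances link their uploads directory to the same shared location.
link_uploads_dir "$shared_uploads" "$workdir/instance-a/html"
link_uploads_dir "$shared_uploads" "$workdir/instance-b/html"

# Each "instance" receives a different upload.
touch "$workdir/instance-a/html/uploads/new-uploads1.txt"
touch "$workdir/instance-b/html/uploads/new-uploads2.txt"

listing_a="$(ls "$workdir/instance-a/html/uploads/")"
listing_b="$(ls "$workdir/instance-b/html/uploads/")"
printf 'Instance A sees:\n%s\n' "$listing_a"
printf 'Instance B sees:\n%s\n' "$listing_b"
```

Both listings contain both files. On real instances the shared directory would live under /mnt/efs and the web root would be /var/www/html, the commands would need sudo, and the user that Apache runs as would need write access to the shared directory.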

Set Permissions and Restart Apache:

sudo chmod 777 /var/www/html/<your web app script file>
sudo systemctl restart httpd

 

Once the EFS file system is mounted on both EC2 instances, upload different files through each instance from your web application.

In our case, we uploaded text files through the two EC2 instances. When the web application is accessed through the load balancer’s DNS name, all of the uploaded files are listed, and they remain listed after every refresh.

The two text files, ‘new-uploads1.txt’ and ‘new-uploads2.txt’, were uploaded through different instances.

When the application is accessed through the load balancer, we can see all the files uploaded through both EC2 instances:

Running a web application on multiple EC2 instances across different Availability Zones makes it hard to keep data consistent when each instance stores files on its own EBS volume, because EBS volumes are not synchronized between instances. Integrating Amazon Elastic File System (EFS) resolved this issue. EFS is a scalable, fully managed shared file system that multiple EC2 instances can mount at the same time, so every instance sees the same set of files. This keeps the uploaded documents consistent regardless of which instance serves a request, and it simplifies the architecture because file storage is managed in one place across Availability Zones.