How to Clone a Linux Box into Amazon EC2

Upon reading the title of this article, many of you probably thought, “Why would I want to do that?  In EC2, I have tons of preconfigured images from almost every Linux distro, and so I simply have to launch an instance, install what I need, and that’s it.”

I can think of many reasons to “clone” a machine (these days it is fashionable to call this P2).  Let me mention a couple of them:

  • I have a Linux server that I have “souped up” over many years exactly the way I want it.  I have installed and configured a lot of software (Gnome, NxServer, Samba, compilers, etc.), as well as my company’s applications, databases, etc.  For disaster recovery, I need a backup machine in the Cloud.  I am far too lazy to reinstall and reconfigure everything from scratch, so it is best to clone it.
  • In order to run tests, I want to set up a distributed environment, or a grid of identical machines, all in the Cloud (I don’t want to buy hardware just to run tests).  If I could prepare one machine with all I need, it would be good to export the image to EC2 and to be able to generate all the instances that I need.

 

Below we walk through a procedure for doing this, using only free and open-source tools.

Source computer

Let us start with a computer with these features:

  • Dell PowerEdge R415
  • 2x AMD Opteron 4184 (12 cores in total)
  • 32 GB RAM
  • 2x 300 GB SAS disks at 15,000 RPM in RAID1
  • CentOS 6.9
  • Gnome 2.28.2

 

What are we going to do?

There are various tools for Windows that let us generate an image of our system and clone it into EC2.  Unfortunately, at the time of this writing, nothing similar exists for Linux (VMConverter by VMware does something similar, but with a lot of restrictions and, of course, it is not open source).  The procedure will therefore be a bit complicated.  In short, we are going to:

  • Generate an ISO image from our system, with the portion of our disks that we are interested in.
  • Boot a virtual machine in VirtualBox from the ISO image, and restore it.
  • Convert our VM to a neutral format.
  • Upload the image in neutral format to Amazon S3
  • Import the image to EC2 in AMI format.
  • And then launch an instance of this AMI.

 

Generate ISO Image

For this phase, we have decided to use the “Mondo Rescue” tool.  It is a bit difficult to handle, but very powerful and versatile.  The first thing we need to do is install it on the source computer.  We start by adding the Mondo repository (careful: from this point on, we must be root):

# cd /etc/yum.repos.d
# wget ftp://ftp.mondorescue.org/rhel/6/x86_64/mondorescue.repo

(Note that we have chosen the repository for Red Hat 6, which is compatible with CentOS 6.  For another version of Linux, we would use the corresponding repository.)

To continue, we install “mondo” and all its dependencies (yum will automatically resolve this):

# yum install mondo

If all has gone well, two tasks remain before generating the ISO image:

Create (if it doesn’t already exist) the directory where the images will be generated.  In our case “/home/sysimg”.

Our machine uses SAS (Serial Attached SCSI) disks, while the destination will probably use SATA or AHCI.  For this reason, we have to “force” mondo to include the required drivers.  To do this, we edit the file “/usr/sbin/mindi”, locate the line that starts with “FORCE_MODS” and change it as follows:

FORCE_MODS="ahci mptsas mptspi"

Now we can run the backup.  This is an example of the command (note that it is a single line):

# nohup mondoarchive -z -O -V -N -i -s 8960m -l GRUB -d /home/sysimg -I / -E "/home/comunytek|/home/sysimg" -9 <&- >mondo.log 2>&1 &

As you can see, we have run the command in the background, given that it will take a couple of hours.  If we want to see how it is progressing, we can enter:

# tail -f mondo.log

If anybody is interested in the details of the “mondoarchive” command, it is described in the following link:

http://www.mondorescue.org/docs/mondorescue-howto.html

Most of the options we have used are almost self-explanatory, but a few are worth mentioning:

-z: Makes mondo back up extended attributes and ACLs, which are needed to restore the SELinux labels.

-N: Excludes all network filesystems (such as NFS mounts) from the backup.

-s 8960m: Defines the maximum size of each ISO image.  In order to avoid generating more than one ISO, we have chosen a size that is approximately double that of a standard DVD.

-E "/home/comunytek|/home/sysimg": Excludes the directories that are not absolutely necessary to us and, above all, the directory where we are going to generate the ISO.
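When several directories have to be excluded, the -E argument is easy to mistype.  The following is only a sketch of how the list could be assembled and the final command reviewed before running it.  It assumes a Mondo version that separates excluded paths with “|”; the helper name join_excludes is made up for this example.

```shell
# Join directory paths with "|" for mondoarchive's -E option.
# (Assumption: this Mondo version expects "|" as the separator.)
join_excludes() {
    ( IFS='|'; printf '%s\n' "$*" )
}

EXCLUDES=$(join_excludes /home/comunytek /home/sysimg)

# Print the command instead of running it, so it can be checked first.
echo mondoarchive -z -O -V -N -i -s 8960m -l GRUB \
     -d /home/sysimg -I / -E "\"$EXCLUDES\"" -9
```

Once the printed line looks right, it can be pasted after “nohup” as in the command shown earlier.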

Creating a virtual machine

We are going to use the Oracle VirtualBox virtualization environment (free), installed on a Windows machine.  It is quite intuitive; however, a few steps are important:

  • Create a VM of the “Linux/Red Hat 64 bits” type.
  • Give it sufficient memory.  It is not necessary to assign the 32 GB of the source computer, just something “reasonable” so that it can boot (for example, 2 GB).
  • Create a dynamically allocated disk with enough space to clone our source computer.  In this case, we have estimated that we need no more than 60 GB.  Later on, we will explain why.
  • In Settings/Storage, add an IDE DVD drive pointing to the ISO image (which we have previously copied to our local disk).
  • In Network, select “Bridged Adapter” and the physical NIC that we want to use.

Before booting our new VM, we have to plan carefully how we are going to partition our virtual disk.  We have to keep exactly the same partition names as on the original computer, but “adapt” their sizes to the new disk.  Our source computer has 300 GB, but the disks are far from full.  In addition, there are directories that do not interest us (logs, work files, etc.).  In this case, the original partitions are:

Device     Boot     Start         End      Blocks   Id  System
/dev/sda1               1           5       32812   de  Dell Utility
/dev/sda2   *           5         266     2097152    b  W95 FAT32
/dev/sda3             266        4345    32768000   83  Linux
/dev/sda4            4345       36405   257522688    5  Extended
/dev/sda5            4345        8425    32768000   82  Linux swap / Solaris
/dev/sda6            8425       36405   224752640   83  Linux

But the important partitions (/dev/sda3 and /dev/sda6) are not full, as we can see with the command “df -h”:

Filesystem            Size  Used Avail Use% Mounted on
/dev/sda3              31G   11G   19G  35% /
tmpfs                  16G  133M   16G   1% /dev/shm
/dev/sda6             211G   47G  154G  24% /home
192.168.0.115:/nfs/SysImages 1,8T  736G  1,1T  41% /mnt/wdnas

 

Therefore, we can estimate the sizes that we are going to need on the destination disk, and note them down.
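To collect the numbers for that estimate, something like the following sketch could be used on the source computer.  It only reports the used space per local partition, rounded up to whole GiB; how much headroom to add on top is left to the reader.  The function name is an invention for this example.

```shell
# Print "<mount point> <used GiB, rounded up>" for each local /dev/* filesystem.
# Reads "df -Pk"-style output on stdin, so it also works on saved output.
used_per_mount() {
    awk 'NR > 1 && $1 ~ /^\/dev\// {
        gib = int(($3 + 1048575) / 1048576)   # 1 GiB = 1048576 KiB
        printf "%s %d\n", $6, gib
    }'
}

# Typical use:
#   df -Pk | used_per_mount
```

Network mounts and tmpfs are skipped because their device name does not start with “/dev/”.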

Now we can start the virtual machine, which will boot from the ISO image and begin the recovery process.

Restore with mondorestore

When the VM boots, the mondorestore prompt will appear:

boot:

Here we have to type “expert” (any other mode simply will not work, because of the peculiarities of the original Dell server).  Mondorestore will load a minimal Linux kernel and leave us at a shell prompt.  To continue, we have to partition the disk using “fdisk /dev/sda”.  The details of how to use this command are beyond the scope of this document.  In this case, we have to end up with a partition table similar to the following:

Device      Start End   Blocks      Id    System
/dev/sda1   1     6     48163+      de    Dell utility
/dev/sda2   7     307   2417782+    b     W95 FAT32
/dev/sda3   308   2308  16073032+   83    Linux
/dev/sda4   2309  7832  44371530    5     Extended
/dev/sda5   2309  2440  1060258+    82    Linux swap
/dev/sda6   2441  7832  43311208+   83    Linux

 

Note that the partitions /dev/sda1 and /dev/sda2, which Dell servers always use, have been preserved.  If they do not exist, our virtual machine will not boot (don’t ask me why, because I don’t know; this is based on personal experience).

Note as well that the digits that appear as Start and End are measured in cylinders, while the “Blocks” column indicates the size in KB.
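To translate a planned size into cylinders (or check a range of cylinders), a small calculation helps.  The sketch below assumes the classic fdisk geometry of 255 heads and 63 sectors of 512 bytes per track, that is, 16065 sectors per cylinder.  This geometry is an assumption, although it agrees with the table above; the function name is made up for this example.

```shell
# Size in KB (1 KB = 1024 bytes) of a partition spanning cylinders $1..$2,
# assuming 255 heads x 63 sectors = 16065 sectors of 512 bytes per cylinder.
cyl_to_kb() {
    echo $(( ($2 - $1 + 1) * 16065 / 2 ))
}

cyl_to_kb 308 2308    # /dev/sda3 in the table above
```

This prints 16073032, the value shown in the “Blocks” column for /dev/sda3.  For some partitions fdisk reports about 32 KB less than this calculation, so treat the result as an approximation.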

Now we can launch the restore.  We simply enter “mondorestore” and answer the questions as follows (basically, we do not want it to partition our disk, but we do want it to format the partitions and restore the content of the ISO):

Mode: Interactive
Origin: DVD
Editing Mountlist <OK>
Continue anyway? YES
Partition hard disks? NO
Format hard drives? YES
Restore all of your data? YES
Initialize the bootloader? YES
Regenerate initrd? YES
..rebuild..your initrd..? OK

 

Here, mondorestore will once again show the shell prompt.  To regenerate our initrd, we have to do the following:

# cd /boot
# ls (to see which images are available)
# mkinitrd --force initramfs-2.6.32-696.xxx.img 2.6.32-696.xxx (always use the most recent image)
# exit

 

With this we return to mondorestore, which will ask whether we have modified the mountlist.  We answer YES, and it will explain that we are going to edit certain system files.  We say YES once more, and it will open an editor on a series of files in turn so that we can modify them if we want to.  The editor is self-explanatory.  In principle, we only have to modify the files “fstab” and “mtab”, removing the lines that contain an NFS mount point (if there are any).

After editing, it will seem like we have finished, but we have to wait a few seconds until a window appears saying that the “grub” bootloader has been initialized.  We select OK, and mondorestore will do a final cleanup and return to the shell prompt.  The restoration is finished, so now we shut down the virtual machine.

Cleaning up the virtual machine

Once the VM is shut down, we go to Settings/Storage and remove the ISO image from the DVD drive.  Now we start the VM again; it should boot CentOS until we see the GNOME login screen.  We log in with our usual username and open a terminal window.  It is important to “clean up” the following (all of this has to be done as root):

  • Uninstall applications or utilities that we are not going to need in the Cloud.
  • Change the hostname, in order to avoid conflict of names.  To do this, edit the file “/etc/sysconfig/network” and change to “HOSTNAME=xxx-bak” (or whatever you wish).  Then, input “hostname xxx-bak”.  Lastly, edit the file “/etc/hosts” to ensure that the host name has been changed and that there is no IP address conflict.
  • It would be good to review the content of “/home” in case we have left behind unnecessary items that should be erased (remember that our virtual disk is 5 times smaller than the original).
  • Check that all is functioning well and that we are not missing anything important, etc. and shut down the VM.
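The hostname change in the list above can also be scripted.  This is only a sketch, assuming the file already contains a line starting with “HOSTNAME=”; the function name is hypothetical and “xxx-bak” is the same placeholder used above.

```shell
# Rewrite the HOSTNAME= line of a sysconfig-style file.
# Arguments: file to edit, new host name.
set_sysconfig_hostname() {
    tmp=$(mktemp) || return 1
    # Edit a copy first, then overwrite the original in place.
    sed "s/^HOSTNAME=.*/HOSTNAME=$2/" "$1" > "$tmp" && cat "$tmp" > "$1"
    rm -f "$tmp"
}

# Typical use, as root:
#   set_sysconfig_hostname /etc/sysconfig/network xxx-bak
#   hostname xxx-bak
```

The file “/etc/hosts” still has to be reviewed by hand, since its contents vary from one machine to another.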

 

Convert the virtual machine to a neutral format

Amazon EC2’s VM Import utility allows us to import many kinds of images but, at this time, does not support VirtualBox’s native formats.  Thus, we have to convert our virtual machine to a format compatible with VM Import.  In this case, we will use the neutral OVA format.  Luckily, this process is very simple:

  • In VirtualBox, select “File/Export Appliance”.
  • Select the virtual machine that we want to export.
  • Enter the complete path of the output file, for example, “d:\VirtualBox\xxx-bak.ova”.
  • It will take a while to convert, but finally we will have an OVA image.
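Before uploading several gigabytes, it can be worth checking that the export is complete.  An OVA file is a tar archive that should contain an .ovf descriptor next to the disk image, so a quick sanity check could look like this sketch (run from any shell that has tar and grep; the function name is made up for this example).

```shell
# Succeeds if the given OVA (a tar archive) contains an .ovf descriptor.
ova_has_ovf() {
    tar -tf "$1" | grep '\.ovf$' > /dev/null
}

# Typical use:
#   ova_has_ovf xxx-bak.ova && echo "looks like a complete OVA"
```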

 

Upload the OVA file to Amazon S3

Of course, we must have an AWS account and a bucket where we can store our image.  We can use whichever tool is most convenient to upload the image.  I usually use “s3browser”, which works very well.  It is a good idea to create a new folder for the image in the destination bucket.  We can call it “VM” or something like that, and upload the OVA image directly to it.

Installing the AWS Command Line Tool (CLI)

The EC2 export/import tools do not have a graphical interface (for now), so the command-line interface must be used.  On Windows, it is easily installed by downloading an MSI file, for example from:

http://docs.aws.amazon.com/cli/latest/userguide/awscli-install-windows.html

Once it has been installed, the first step is a basic configuration of our AWS access.  To do this, in a cmd window, we enter “aws configure”.  We answer the questions (access keys, region, etc.) and now we can run AWS commands.

Import the image to EC2

In order to import our OVA image, we must create (in the same directory where we have the image) a file “xxx.json”, with xxx being the image name, with contents similar to the following:

[
  {
    "Description": "My OVA machine",
    "Format": "ova",
    "UserBucket": {
        "S3Bucket": "your_bucket",
        "S3Key": "VM/your_image.ova"
    }
  }
]
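If several images have to be imported, the file can be generated instead of edited by hand.  A minimal sketch, for a Unix-like shell; the function name and the example arguments are placeholders, not real bucket names.

```shell
# Write the disk-container description used by "aws ec2 import-image".
# Arguments: output file, S3 bucket name, S3 key of the uploaded OVA.
write_containers_json() {
    cat > "$1" <<EOF
[
  {
    "Description": "My OVA machine",
    "Format": "ova",
    "UserBucket": {
        "S3Bucket": "$2",
        "S3Key": "$3"
    }
  }
]
EOF
}

# Typical use:
#   write_containers_json xxx.json your_bucket VM/your_image.ova
```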

 

Then we run the import command (careful, this is a single line):

D:\VirtualBox> aws ec2 import-image --description "Clon from OVA" --license-type BYOL --role-name vmimport --disk-containers file://xxx.json

It is important to note down the ID of the import task, because we will need it later.

IMPORTANT: the vmimport role has to be created beforehand.  The documentation explains how to do it:

http://docs.aws.amazon.com/vm-import/latest/userguide/vmimport-image-import.html
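As an orientation only, the first half of that setup could look like the sketch below.  It assumes the trust policy described in the VM Import documentation, which lets the service “vmie.amazonaws.com” assume the role; check the current documentation before relying on it.  The function name and the file name “trust-policy.json” are placeholders.

```shell
# Write a trust policy that lets the VM Import service assume the role.
# (Policy contents assumed from the VM Import documentation.)
write_trust_policy() {
    cat > "$1" <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "vmie.amazonaws.com" },
      "Action": "sts:AssumeRole",
      "Condition": {
        "StringEquals": { "sts:Externalid": "vmimport" }
      }
    }
  ]
}
EOF
}

# Typical use:
#   write_trust_policy trust-policy.json
#   aws iam create-role --role-name vmimport \
#       --assume-role-policy-document file://trust-policy.json
```

The role also needs a second policy that grants access to the S3 bucket and to the EC2 import operations; its contents are listed in the same documentation.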

The command is remotely run in EC2, and it takes quite a while, so be patient.  If at any time we want to check the status of the task, we can use the following command:

aws ec2 describe-import-image-tasks --import-task-ids import-ami-xxx

Where xxx is the id of the import task that we previously noted down.
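Instead of repeating the command by hand, the check can be put in a loop.  This sketch assumes a Unix-like shell and that the task reports the status “active” while it is running and “completed” at the end.  The function names and the POLL_SECONDS variable are inventions for this example.

```shell
# Ask once for the task status, e.g. "active" or "completed".
get_status() {
    aws ec2 describe-import-image-tasks --import-task-ids "$1" \
        --query 'ImportImageTasks[0].Status' --output text
}

# Poll until the task is no longer "active".
# Returns success only if the final status is "completed".
wait_for_import() {
    while :; do
        status=$(get_status "$1") || return 1
        echo "status: $status"
        [ "$status" = "active" ] || break
        sleep "${POLL_SECONDS:-60}"
    done
    [ "$status" = "completed" ]
}

# Typical use:
#   wait_for_import import-ami-xxx && echo "AMI ready"
```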

Once the import has finished, we can proceed to launch an instance in EC2.

Launch AMI in EC2

We do this step in the AWS console, and it is very simple:

  • We go to the EC2 service and click on AMIs.  A new AMI corresponding to our import should appear.  In the “AMI Name” column, the import task name that we noted down before will appear.
  • To avoid confusion in the future, it is a good idea to change the “alias”.  We just hover the mouse over the “Name” column and an editing symbol will appear.  We click on it and enter an explanatory name, for example, “milinux-bak”.
  • We select the AMI and click “Launch”.  A wizard will start and ask us for the instance type, the IAM role, the security group, the EBS disk size we need (in this case, 60 GB), etc.  (I presume that the reader is familiar with Amazon EC2.)
  • Once launched, the instance will initially be in “stopped” status.  In the EC2 console, if we click on “Instances”, we can select it and click “Actions/Instance State/Start”.  We should wait until it has booted completely.  In the lower portion of the window, we will see all the details of our new instance, including the public IP address that EC2 has assigned (initially this IP is volatile).  We should now be able to connect with PuTTY or NoMachine, and even access it through Samba (depending on our original server).

 

Assign a static public IP

This last step is optional.  The idea is to assign the instance a public IP that remains constant, regardless of whether the VM is started or stopped.  For this we use an “Elastic IP”:

  • We stop the instance, and wait until it is completely stopped.
  • We click on “Elastic IPs”.
  • We click on “Allocate new address”.  A new IP will be generated for us.
  • We associate this new IP with our new instance.
  • Lastly, boot the instance once again, which will now always appear with this public IP.
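The same steps can also be carried out with the AWS CLI.  The following is a sketch only, assuming a Unix-like shell, an already configured CLI and an instance that runs in a VPC.  The function name and the instance ID are placeholders.

```shell
# Allocate a new Elastic IP and associate it with the given instance.
# Prints the allocation ID of the new address.
assign_elastic_ip() {
    alloc_id=$(aws ec2 allocate-address --domain vpc \
        --query AllocationId --output text) || return 1
    aws ec2 associate-address --instance-id "$1" \
        --allocation-id "$alloc_id" > /dev/null || return 1
    echo "$alloc_id"
}

# Typical use:
#   assign_elastic_ip i-xxx
```

Keep in mind that an allocated Elastic IP that is not associated with a running instance may be billed, so it is worth releasing addresses that are no longer needed.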

 

And that’s it.  We now have a contingency environment in EC2.  We hope you enjoy it!