Setting up Ubuntu on GPU EC2 in AWS for ML with PyTorch – part 2

In the first part of this post, we showed you how to set up a G2 class EC2 host in the AWS cloud and how to install Tesla GPU drivers for the Ubuntu operating system on it. Now all that’s left to do is to configure the PyTorch environment and run it on the interactive Jupyter Notebook platform.

One of the easiest and most productive approaches is to use the dedicated data science distribution, Anaconda. At the time of writing this post, the most recent version is Anaconda3 2020.11. Let's download the installer and run it.

wget https://repo.anaconda.com/archive/Anaconda3-2020.11-Linux-x86_64.sh
chmod +x ./Anaconda3-2020.11-Linux-x86_64.sh
./Anaconda3-2020.11-Linux-x86_64.sh

During the installation, we will be asked to accept the license conditions and specify the package installation location – stay with the default destination: /home/ubuntu/anaconda3

Finally, let’s confirm that we want the installer to initialize Anaconda3
by running conda init automatically.

Now, for the configuration to take effect, we need to restart the shell session on the Ubuntu 18.04 host or execute the command:

source ~/.bashrc
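To confirm that Anaconda is active, we can check the conda version and which Python interpreter is now in use; a quick sanity check (the path assumes the default install location from above):

```shell
# conda should now be on the PATH
conda --version
# python should resolve to the Anaconda interpreter,
# e.g. /home/ubuntu/anaconda3/bin/python
which python
```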

With Anaconda installed, we can now focus on the PyTorch package. In addition, we will show here how to install the increasingly popular FastAI package – a high-level wrapper around PyTorch that enables faster prototyping:

git clone https://github.com/fastai/fastai2
cd fastai2
conda env create -f environment.yml

The above commands prepare the FastAI environment. Now, let's install the latest version of the FastAI platform in the newly created environment.
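Before installing anything into it, the conda environment created from environment.yml has to be activated; a sketch, assuming the environment is named fastai2 (the actual name is defined in the repository's environment.yml):

```shell
# Activate the environment created by "conda env create" above
# (we assume here that environment.yml names it "fastai2")
conda activate fastai2
```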

pip install fastai2
pip install nbdev
nbdev_install_git_hooks
conda install pyarrow
pip install pydicom kornia opencv-python scikit-image
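To confirm that the installation succeeded, we can print the installed version from Python; a quick check, assuming the package imports under the module name fastai2:

```shell
# Print the installed FastAI version (module name assumed to be fastai2)
python -c "import fastai2; print(fastai2.__version__)"
```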

If we also want to download the latest FastAI user manual, we can do so by executing the command:

git clone https://github.com/fastai/fastbook

Now all that’s left to do is start the Jupyter notebook server:

jupyter notebook

This will start the Jupyter server on localhost, on the default port 8888. To use the notebooks in the browser on our workstation, we only need to configure an SSH tunnel, enter localhost:8888 in the browser's address bar, and, when prompted for a token, copy it from the EC2 host console.
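The tunnel can be set up with SSH local port forwarding; a sketch, where the key file path and the EC2 public DNS name are placeholders to be replaced with your own values from part 1:

```shell
# Forward local port 8888 to port 8888 on the EC2 host,
# so the remote Jupyter server appears at localhost:8888
# (key path and hostname are placeholders – substitute your own)
ssh -i ~/.ssh/my-key.pem -L 8888:localhost:8888 ubuntu@<ec2-public-dns>
```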

Now you can check if the PyTorch library is properly configured and can communicate with the GPU.

import torch
torch.cuda.is_available()  # should return True if the GPU drivers are set up correctly
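If the call returns True, we can additionally query the device from the notebook; a minimal sketch:

```python
import torch

if torch.cuda.is_available():
    # Number of CUDA devices visible to PyTorch
    print(torch.cuda.device_count())
    # Name of the first CUDA device
    print(torch.cuda.get_device_name(0))
else:
    print("No GPU detected – check the driver installation from part 1")
```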

Now you are ready for your exciting deep learning adventure!
