Micah P. Dombrowski / Dec 03 2018

Compiling MXNet

1. Install Prerequisites

System dependencies.

apt-get -qq update
apt-get install -y --no-install-recommends \
  software-properties-common apt-transport-https \
  build-essential cmake libjemalloc-dev \
  libatlas-base-dev liblapack-dev liblapacke-dev libopenblas-dev libopencv-dev \
  libcurl4-openssl-dev libzmq3-dev ninja-build libhdf5-dev libomp-dev
apt-get clean
rm -r /var/lib/apt/lists/* # Clear package list so it isn't stale

Install base Python.


Install the remaining Python packages. We use the Intel channel because it has all of the MKL packages and most of the other things we need.

conda install \
  setuptools six nose boto3 \
  pylint=2.1.1 'requests<2.19.0,>=2.18.4' \
  'numpy<=1.15.2,>=1.8.2' scipy=1.0.1 h5py=2.8.0
conda clean -tipsy

pip install cpplint==1.3.0 nose-timer

ldconfig

Install Maven.

MAV_VER="3.6.0"
FILE="apache-maven-${MAV_VER}-bin.tar.gz"

wget --progress=bar:force -O $FILE \
  http://ftp.naz.com/apache/maven/maven-3/${MAV_VER}/binaries/$FILE

tar -zxf $FILE -C /usr/local
rm $FILE

ln -sf /usr/local/apache-maven-${MAV_VER} /usr/local/maven

/usr/local/maven/bin/mvn --version
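
Third-party mirrors drop old releases over time, so the download above may need a different source. The following is a minimal sketch, assuming the Apache archive at archive.apache.org as a fallback (it is not used in the original steps). The helper names `maven_url` and `verify_sha512` are hypothetical, and the expected digest must come from the Apache site itself.

```shell
# Hypothetical helpers; neither appears in the original notes.

# Print the archive.apache.org URL for a Maven 3.x binary tarball.
maven_url() {
  ver="$1"
  printf 'https://archive.apache.org/dist/maven/maven-3/%s/binaries/apache-maven-%s-bin.tar.gz\n' "$ver" "$ver"
}

# Succeed only if the SHA-512 digest of file $1 equals the hex digest $2.
verify_sha512() {
  actual=$(sha512sum "$1" | cut -d' ' -f1)
  [ "$actual" = "$2" ]
}
```

A possible use would be `wget -O "$FILE" "$(maven_url "$MAV_VER")" && verify_sha512 "$FILE" "$EXPECTED_SHA512"`, where `EXPECTED_SHA512` is a placeholder for the published digest.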

2. Compile

Get MXNet source.

rm -rf /mxnet
git clone --recursive https://github.com/apache/incubator-mxnet /mxnet
cd /mxnet
git checkout v1.4.x
git submodule update --init --recursive

Now the compilation, which needs absurd amounts of RAM (over 10 GB) and takes over three hours. We edit the cmake defs to set the minimum CPU to the Sandy Bridge family, and also set a bunch of options. Of special interest are:

  • USE_OLDCMAKECUDA, because the newer cmake build setup fails with CUDA
  • CUDA_ARCH_*, which set the 'Compute Capability' identifiers we compile for. The current setting builds binary compatibility for K80, P4, P100, and V100 GPUs, and JIT capability for the entire Kepler, Pascal, and Volta families
  • USE_SSE, which is set off because we do not need that specifier when we set -march
  • IS_CONTAINER_BUILD, which makes things less picky about whether or not a GPU is currently available
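
To make the binary versus JIT distinction concrete, here is a rough sketch of how two such lists translate into nvcc `-gencode` flags. The helper `gencode_flags` is hypothetical and only illustrates the mapping; the flags MXNet's own build scripts generate may differ in order or detail.

```shell
# Hypothetical helper: turn "3.7 6.0" style lists into nvcc -gencode flags.
gencode_flags() {
  bin_list="$1"   # architectures to build native GPU binaries for
  ptx_list="$2"   # architectures to embed PTX for (JIT-compiled at load time)
  out=""
  for cc in $bin_list; do
    n=$(printf '%s' "$cc" | tr -d '.')
    out="$out -gencode arch=compute_${n},code=sm_${n}"
  done
  for cc in $ptx_list; do
    n=$(printf '%s' "$cc" | tr -d '.')
    out="$out -gencode arch=compute_${n},code=compute_${n}"
  done
  # Drop the leading space left by the first append.
  printf '%s\n' "${out# }"
}

# The lists used in the cmake call in these notes.
gencode_flags "3.7 6.0 6.1 7.0" "3.0 6.0 7.0"
```

The binary entries correspond to the K80 (3.7), P100 (6.0), P4 (6.1), and V100 (7.0). The PTX entries let the driver compile code at load time for other devices of the same or a newer architecture.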
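# Check that the CUDA toolkit is where the build expects it.
ls /usr/local/cuda/bin

# Makefile build: restrict the CUDA architecture list, then compile.
cd /mxnet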

sed -i 's#30 35 50 52 60 61 70 75#37 60 61 70#' Makefile

make -j5 \
  USE_CUDA=1 USE_CUDA_PATH="/usr/local/cuda" USE_CUDNN=1 \
  USE_OPENMP=1 USE_MKLDNN=1 USE_OPENCV=1 USE_F16C=1 \
  USE_JEMALLOC=1 USE_THREADED_ENGINE=1 \
  ADD_CFLAGS="-march=sandybridge -mtune=generic" \
  ADD_LDFLAGS="-L/usr/local/cuda/lib64/stubs" \
  USE_LAPACK=1 USE_LAPACK_PATH="/usr/lib/x86_64-linux-gnu" USE_BLAS="openblas" \
  VERBOSE=1
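
The `-j5` above is a fixed choice. Given how much memory this build uses, the job count could instead be derived from the machine. The sketch below assumes Linux (`/proc/meminfo`) and roughly 2 GB per compile job; that figure is a guess for illustration, not a measurement from this build, and `pick_jobs` is a hypothetical helper.

```shell
# Hypothetical helper: cap parallel jobs by memory and by core count.
# Arguments: total memory in kB, number of cores, assumed GB needed per job.
pick_jobs() {
  mem_kb="$1"; cores="$2"; gb_per_job="$3"
  by_mem=$(( mem_kb / 1024 / 1024 / gb_per_job ))
  # Always allow at least one job.
  if [ "$by_mem" -lt 1 ]; then by_mem=1; fi
  if [ "$by_mem" -lt "$cores" ]; then echo "$by_mem"; else echo "$cores"; fi
}

# Linux-specific: MemTotal in /proc/meminfo is reported in kB.
if [ -r /proc/meminfo ]; then
  total_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
  echo "suggested: make -j$(pick_jobs "$total_kb" "$(nproc)" 2)"
fi
```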
# Build the Scala package; this needs Maven on the PATH.
cd /mxnet

export PATH="/usr/local/maven/bin:$PATH:/mxnet/lib:/opt/conda/lib"

make scalapkg USE_LAPACK_PATH="/usr/lib/x86_64-linux-gnu" VERBOSE=1

# CMake build: add -march=sandybridge to the default flags, then configure and compile.
cd /mxnet

sed -i 's#"-mtune=generic"#"-march=sandybridge -mtune=generic"#' CMakeLists.txt

rm -rf build; mkdir build; cd build

cmake \
  -DCMAKE_EXE_LINKER_FLAGS="-L/usr/local/cuda/lib64/stubs" \
  -DCMAKE_SHARED_LINKER_FLAGS="-L/usr/local/cuda/lib64/stubs" \
  -DUSE_OLDCMAKECUDA=ON -DUSE_CUDA=ON -DUSE_NCCL=ON -DCUDA_ARCH_NAME="Manual" \
  -DCUDA_ARCH_BIN="3.7 6.0 6.1 7.0" -DCUDA_ARCH_PTX="3.0 6.0 7.0" \
  -DUSE_SSE=OFF -DBUILD_CPP_EXAMPLES=OFF -DIS_CONTAINER_BUILD=TRUE ..

make -j5 --no-print-directory VERBOSE=1
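
Both `sed -i` edits above exit successfully even when the pattern is not found, so a change in the upstream Makefile or CMakeLists.txt would silently turn them into no-ops. A minimal sketch of a guard follows, using a hypothetical helper named `replace_or_fail` and assuming GNU sed.

```shell
# Hypothetical helper: replace a string in a file, failing if it is absent.
# Uses '#' as the sed delimiter, so neither string may contain '#'.
# The old string is matched literally by grep but as a regex by sed, so it
# should not contain regex metacharacters; the new string should not contain
# '&' or backslashes.
replace_or_fail() {
  old="$1"; new="$2"; file="$3"
  if ! grep -qF -- "$old" "$file"; then
    echo "pattern not found in $file: $old" >&2
    return 1
  fi
  sed -i "s#$old#$new#" "$file"
}
```

For example, `replace_or_fail '30 35 50 52 60 61 70 75' '37 60 61 70' Makefile` would stop the build early instead of compiling for the default architecture list.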
# Package the built libraries.
cd /mxnet/lib
tar -zcf /results/mxnetlibs.tar.gz *
cp mxnetlibs.tar.gz /mxnetlibs.tar.gz
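
Since the build takes hours, it may be worth confirming that the archive holds what is expected before discarding the build environment. This is a small sketch with a hypothetical helper, `archive_has`; the file name `libmxnet.so` in the usage note is an assumption about what the build leaves in /mxnet/lib.

```shell
# Hypothetical helper: succeed if a gzipped tarball has an entry with this exact name.
archive_has() {
  tar -tzf "$1" | grep -qxF -- "$2"
}
```

One way to use it: `archive_has /results/mxnetlibs.tar.gz libmxnet.so || echo "libmxnet.so missing" >&2`.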