Managing ARM vs x86 Python on M1 Macs

Problem: Intel optimisations now break x86 emulation support for common Python data science libraries (PyTorch, Matplotlib, scikit-learn) on newer Apple M1 Macs.

See this GIST on GitHub
https://gist.github.com/CINJ/6e6dbd0eefad11a69dd1c7308b36359c

With a somewhat complex set of data science and database-related dependencies (Matplotlib, scikit-learn, and PyTorch amongst others), I initially came across this error:

Error #15: Initializing libiomp5.dylib, but found libiomp5.dylib already initialized.
OMP: Hint: This means that multiple copies of the OpenMP runtime have been linked into the program. That is dangerous, since it can degrade performance or cause incorrect results. The best thing to do is to ensure that only a single OpenMP runtime is linked into the process, e.g. by avoiding static linking of the OpenMP runtime in any library. As an unsafe, unsupported, undocumented workaround you can set the environment variable KMP_DUPLICATE_LIB_OK=TRUE to allow the program to continue to execute, but that may cause crashes or silently produce incorrect results. For more information, please see http://www.intel.com/software/products/support/.

The solutions I found when searching on Google mostly involved forcing the Conda environment to use x86 versions of the dependencies, on the assumption that Rosetta 2 emulation would handle them. After doing this, I ran into issues with Intel MKL, as per the error below:

Intel MKL FATAL ERROR: This system does not meet the minimum requirements for use of the Intel(R) Math Kernel Library. The processor must support the Intel(R) Supplemental Streaming SIMD Extensions 3 (Intel(R) SSSE3) instructions. The processor must support the Intel(R) Streaming SIMD Extensions 4.2 (Intel(R) SSE4.2) instructions. The processor must support the Intel(R) Advanced Vector Extensions (Intel(R) AVX) instructions.

The bottom line is that although all the solutions focused on ensuring a single architecture was in use, none of them resolved my issues. It seems that as more of the common data science libraries have been migrated to Apple silicon, more platform-specific optimisations have been added to the mix, causing architecture emulation to break.
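
Before changing anything, it can help to confirm which architecture the current shell is actually running as. The following is a minimal sketch and not part of the original script: `classify_arch` is a hypothetical helper name, and the check relies on the macOS `sysctl.proc_translated` key, which reports 1 for a process running under Rosetta 2.

```shell
#!/bin/bash
# Hypothetical helper (not part of env.sh): describe how a process is running.
#   $1 = machine name as reported by `uname -m`
#   $2 = 1 if the process is translated by Rosetta 2, otherwise 0
classify_arch() {
  local machine="$1" translated="$2"
  if [ "$translated" == "1" ]; then
    echo "x86_64 under Rosetta 2"
  elif [ "$machine" == "arm64" ] || [ "$machine" == "aarch64" ]; then
    echo "native ARM"
  else
    echo "native ${machine}"
  fi
}

MACHINE="$(uname -m)"
# The sysctl key only exists on Apple silicon Macs, so fall back to 0 elsewhere.
TRANSLATED="$(sysctl -n sysctl.proc_translated 2>/dev/null || echo 0)"
echo "This shell is running as: $(classify_arch "$MACHINE" "$TRANSLATED")"
```

The same question can be asked of an interpreter: `python -c "import platform; print(platform.machine())"` prints `arm64` for a native build and `x86_64` for an x86 build running under emulation.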

I realised the best solution was not to use compatibility mode at all, but to force the Conda installation to use M1-specific ARM builds for everything. It has been nearly two years since the M1 was introduced, so we should be safe in doing this anyway, and the bash script below proved me correct.

You can easily modify the script below to:

  • use plain Conda instead of (the vastly superior) Mamba
  • change where the yml file is expected to be
  • change your 'forced' architecture (I can't think of other environments where this emulation problem would be an issue so I only made the script support M1)
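
As a rough sketch of the first and third changes, the helpers below pick Mamba when it is installed and fall back to plain Conda otherwise, and map an OS and machine name onto one of the Conda subdir names. Both function names (`pick_solver`, `conda_subdir_for`) are hypothetical and do not appear in the script; the mapping covers only the combinations mentioned in this write-up.

```shell
#!/bin/bash
# Hypothetical helpers, not part of env.sh.

# Print "mamba" when it is on the PATH, otherwise fall back to "conda".
pick_solver() {
  if command -v mamba >/dev/null 2>&1; then
    echo "mamba"
  else
    echo "conda"
  fi
}

# Map `uname -s` and `uname -m` values to a Conda subdir name.
# Returns non-zero for combinations this sketch does not know about.
conda_subdir_for() {
  local os="$1" machine="$2"
  case "${os}-${machine}" in
    Darwin-arm64)  echo "osx-arm64" ;;
    Darwin-x86_64) echo "osx-64" ;;
    Linux-aarch64) echo "linux-aarch64" ;;
    Linux-x86_64)  echo "linux-64" ;;
    Linux-ppc64le) echo "linux-ppc64le" ;;
    *)             return 1 ;;
  esac
}

SOLVER="$(pick_solver)"
if SUBDIR="$(conda_subdir_for "$(uname -s)" "$(uname -m)")"; then
  echo "Would run: CONDA_SUBDIR=${SUBDIR} ${SOLVER} env create -f ENV_NAME.yml"
else
  echo "No known Conda subdir for $(uname -s) $(uname -m)"
fi
```

To force a different architecture, the value passed as `CONDA_SUBDIR` would simply be set by hand instead of derived from `uname`.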

The expectation is that you understand the format of Conda dependency YML files. See https://stackoverflow.com/questions/44742138/how-to-create-a-conda-environment-based-on-a-yaml-file for more details.
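
For orientation, here is a small illustrative environment file, written out from the shell so that it can be copied and pasted. Everything in it is a placeholder: the environment name, the conda-forge channel, the Python version, and the package list are assumptions made for the example and should be replaced with your own dependencies.

```shell
#!/bin/bash
# Hypothetical example: writes a minimal environment file into the current directory.
# The environment name, channel, Python version, and package list are placeholders.
ENV_NAME="my_environment"

cat > "${ENV_NAME}.yml" <<EOF
name: ${ENV_NAME}
channels:
  - conda-forge
dependencies:
  - python=3.10
  - matplotlib
  - scikit-learn
  - pytorch
EOF

echo "Wrote ${ENV_NAME}.yml"
```

The `name:` field should match the file name, because the script activates the environment using the name passed on the command line.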

env.sh

#!/bin/bash
# (C)2022 J.Cincotta
# Version 5.2
#
# Usage:
#
# Create conda environment
# env.sh ENV_NAME
#
# Update conda environment
# env.sh -u ENV_NAME
#
# Delete conda environment
# env.sh -d ENV_NAME
#
# Assumes that there is a yml file in the same directory as the env.sh script.
# This file should be named the same as the environment. For example, if my
# environment name is "my_environment" then it expects "my_environment.yml".
# Works with Linux for x86 and ARM, and macOS for x86 and M1.

DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && pwd )"
# uname -m reports x86_64 on Intel Macs, whereas the macOS arch command prints i386
ARCH=$(uname -m)
OS=$(uname)

# ARCH Values:
#   aarch64 : Linux ARM
#   arm64   : M1
#   x86_64  : X86 Linux or Mac
# OS Values:
#   Linux   : Linux
#   Darwin  : Mac
# Available Conda Architecture Dependency Paths:
# linux-64
# linux-aarch64
# linux-ppc64le
# osx-64
# osx-arm64
# win-64

COMPAT=0

# Force Conda to use M1 ARM builds for everything
COMARCH="osx-arm64"

if [ "$ARCH" == "arm64" ] && [ "$OS" == "Darwin" ];
then
  echo "Compatibility mode enabled for M1 Mac"
  COMPAT=1
fi


if [ "$1" == "-delete" ] || [ "$1" == "-d" ];
then
  conda remove --name "$2" --all
  exit 0
fi

if [ "$1" == "-update" ] || [ "$1" == "-u" ];
then
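  CONDENV="$2"
  # Load conda's shell functions so that "conda activate" works inside a script
  eval "$(conda shell.bash hook)"
  conda activate "${CONDENV}"
  if [ "$COMPAT" == "1" ];
  then
    # Pin the environment to ARM builds, then update using the same subdir
    conda config --env --set subdir "$COMARCH"
    CONDA_SUBDIR="$COMARCH" mamba env update -f "${DIR}/${CONDENV}.yml"
  else
    mamba env update -f "${DIR}/${CONDENV}.yml"
  fi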
else
  CONDENV="$1"
  echo "Install Mamba"
  conda install -y mamba -n base -c conda-forge
  echo "Create ${CONDENV}"
  if [ "$COMPAT" == "1" ];
  then
    # Resolve ARM builds at creation time, then pin the subdir inside the new environment
    CONDA_SUBDIR="$COMARCH" mamba env create -f "${DIR}/${CONDENV}.yml"
    eval "$(conda shell.bash hook)"
    conda activate "${CONDENV}"
    conda config --env --set subdir "$COMARCH"
  else
    mamba env create -f "${DIR}/${CONDENV}.yml"
    eval "$(conda shell.bash hook)"
    conda activate "${CONDENV}"
  fi
fi
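
Once the environment exists, one way to confirm that it really contains native ARM builds is to ask its Python interpreter what machine type it sees. This is only a sketch: `check_python_arch` is a hypothetical helper that is not part of env.sh, and it assumes the interpreter you want to test is either named `python` on the PATH or passed explicitly as the second argument.

```shell
#!/bin/bash
# Hypothetical helper (not part of env.sh): succeed when the given Python
# interpreter reports the expected machine type.
#   $1 = expected value, e.g. arm64
#   $2 = interpreter to run (defaults to "python")
check_python_arch() {
  local expected="$1" py="${2:-python}" actual
  # Return 2 if the interpreter cannot be run at all
  actual="$("$py" -c 'import platform; print(platform.machine())')" || return 2
  if [ "$actual" == "$expected" ]; then
    echo "OK: ${py} reports ${actual}"
  else
    echo "Mismatch: ${py} reports ${actual}, expected ${expected}"
    return 1
  fi
}

# Example (run inside the activated environment on an M1 Mac):
#   check_python_arch arm64
```

Because env.sh runs in its own process, the activation it performs does not carry over to the calling shell, so run `conda activate ENV_NAME` yourself before checking.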