I recently purchased an Apple M1 Max with 64GB of RAM because my old 2016 MacBook Pro is slowly dying: the battery is giving out and the machine randomly restarts under any processor-intensive load. I actually had to send the first M1 I purchased back to Apple after a few days because of a hardware manufacturing issue that made the screen flicker. Once the replacement finally came, one of the tasks at hand was migrating all of my code from my old personal computer to the new one and setting up the development environment. It went pretty smoothly until I ran one of my NBA models and noticed that it ran excruciatingly slowly. What gives? This computer is a beast and should be far faster than the 2016 MacBook Pro, yet one of my Python models took a significant performance hit, so I dove into what was going on.
Here was the time output of the NBA model on the new Apple M1 Max:
python ./nba_model.py 20221022 2196.36s user 642.10s system 385% cpu 12:15.39 total
This was roughly four to six times slower than on the old MacBook, where the model would finish in about 2-3 minutes. Clearly something was wrong, so I ran htop to see how things were running. After doing some research I began to suspect an issue with running on Apple Silicon, since the new M1 machines use Apple’s own CPU and unified memory architecture instead of Intel’s. There were a lot of threads online but nothing specific about xgboost being slow on M1s. After some fiddling around and experimentation I figured out that this was a tale of two package managers. Initially I installed xgboost the “correct” Anaconda way:
(base) nickell@mlm1 % conda install xgboost
Collecting package metadata (current_repodata.json): done
Solving environment: done
Package Plan
environment location: /Users/nickell/opt/anaconda3
added / updated specs:
- xgboost
The following NEW packages will be INSTALLED:
_py-xgboost-mutex pkgs/main/osx-64::_py-xgboost-mutex-2.0-cpu_0 None
libxgboost pkgs/main/osx-64::libxgboost-1.5.0-he9d5cce_2 None
py-xgboost pkgs/main/osx-64::py-xgboost-1.5.0-py39hecd8cb5_2 None
xgboost pkgs/main/osx-64::xgboost-1.5.0-py39hecd8cb5_2 None
Proceed ([y]/n)? y
Preparing transaction: done
Verifying transaction: done
Executing transaction: done
Retrieving notices: ...working... done
(base) nickell@mlm1 pops %
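The package plan above lists osx-64 builds, which is the channel for Intel Macs. When chasing this kind of slowdown, it can help to check which architecture the running Python interpreter reports and which xgboost version it sees. The following is a minimal sketch, not something I ran as part of the original debugging: the helper names are made up for illustration, and it assumes the macOS sysctl key sysctl.proc_translated is available (it reports 1 when a process runs under Rosetta translation and is missing on Intel Macs).

```python
import platform
import subprocess
import sys
from importlib import metadata


def rosetta_translated():
    """Return True/False on macOS if sysctl can tell, otherwise None."""
    if sys.platform != "darwin":
        return None
    try:
        out = subprocess.run(
            ["sysctl", "-n", "sysctl.proc_translated"],
            capture_output=True, text=True, check=True,
        ).stdout.strip()
    except (OSError, subprocess.CalledProcessError):
        # The key does not exist on Intel Macs, so sysctl exits with an error.
        return None
    return out == "1"


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def environment_report():
    """Collect the facts that matter when comparing two installs."""
    return {
        "machine": platform.machine(),          # e.g. 'arm64' or 'x86_64'
        "python": platform.python_version(),
        "rosetta": rosetta_translated(),
        "xgboost": installed_version("xgboost"),
    }


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```

Running this once in each environment before timing the model makes it easier to tell whether two runs differ in architecture, in library version, or in both.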
Having accepted that this install of the library was too slow, I uninstalled it. Now, instead of using Anaconda, let’s try old-school pip:
(base) nickell@mlm1 % pip install xgboost
Collecting xgboost
Downloading xgboost-1.6.2-py3-none-macosx_10_15_x86_64.macosx_11_0_x86_64.macosx_12_0_x86_64.whl (1.7 MB)
1.7/1.7 MB 13.0 MB/s eta 0:00:00
Lo and behold, here’s the improved performance time:
python ./nba_model.py 20221022 105.95s user 23.25s system 439% cpu 29.416 total
A performance improvement of about 25x in wall-clock time (12:15.39 down to 29.416 seconds). The moral of the story is that if you’re installing xgboost on the new Apple M1, use pip install and not conda install. Building from source would also be a good exercise. My overall takeaway is to keep an eye on the performance of certain Python libraries and packages on the M1, as it’s still a relatively new target architecture. Even running PyTorch on M1 GPUs has only been possible since May: https://pytorch.org/blog/introducing-accelerated-pytorch-training-on-mac/. If I run into similar issues with other packages, then reinstalling with a different package manager is the first place I’d look.
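For a more exact figure, the speedup can be computed directly from the two time outputs quoted earlier. This is a small sketch: the helper name wall_seconds is made up for illustration, and it assumes the zsh time format shown above, where the field before "total" is the elapsed wall-clock time, either as seconds or as minutes:seconds.

```python
import re


def wall_seconds(time_line):
    """Parse the elapsed time before 'total' (e.g. '12:15.39' or '29.416') into seconds."""
    match = re.search(r"(\S+) total\s*$", time_line)
    if match is None:
        raise ValueError("no 'total' field found")
    seconds = 0.0
    # Handles both 'SS.sss' and 'M:SS.ss' (and 'H:MM:SS') by folding left to right.
    for part in match.group(1).split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


conda_run = "python ./nba_model.py 20221022 2196.36s user 642.10s system 385% cpu 12:15.39 total"
pip_run = "python ./nba_model.py 20221022 105.95s user 23.25s system 439% cpu 29.416 total"

conda_s = wall_seconds(conda_run)
pip_s = wall_seconds(pip_run)
print(f"conda: {conda_s:.2f} s, pip: {pip_s:.2f} s, speedup: {conda_s / pip_s:.1f}x")
# prints: conda: 735.39 s, pip: 29.42 s, speedup: 25.0x
```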