Trying out AMD's NPU (Ryzen AI XDNA)


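I recently built a new Linux development machine, but since I had the chance, I installed Windows 11 on it to try running Ryzen AI.

Ryzen AI also ran on the 8600G from the Ryzen 8000 series, so here is a record of the setup.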

Installation Instructions - Ryzen AI Software 1.1 documentation (amd.com)

Setting everything up exactly as the Ryzen AI documentation describes was enough to get it working. The steps I followed are below.

Installing the driver

.\amd_install_kipudrv.bat

Installing the required software

These are the download links I used for each item.

Visual Studio 2019

Download older versions of Visual Studio: 2019, 2017, 2015 and earlier (microsoft.com)

CMake

https://github.com/Kitware/CMake/releases/download/v3.29.0/cmake-3.29.0-windows-x86_64.msi

Python 3.11

Python Release Python 3.11.8

Anaconda3

https://repo.anaconda.com/archive/Anaconda3-2024.02-1-Windows-x86_64.exe

For both CMake and Anaconda, enable the installer checkbox that adds the tool to PATH.
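Before running the Ryzen AI installer, it can help to confirm that the tools really are on PATH. The following is a minimal sketch, not part of the Ryzen AI package: the executable names `cmake`, `conda`, and `python` are assumptions about what each installer registers, and `find_tools` / `missing_tools` are hypothetical helper names.

```python
import shutil

# Executable names are assumptions; adjust them if your installs differ.
REQUIRED_TOOLS = ["cmake", "conda", "python"]


def find_tools(names, which=shutil.which):
    """Map each tool name to its resolved path, or None if it is not on PATH."""
    return {name: which(name) for name in names}


def missing_tools(found):
    """Return the sorted names of tools that could not be resolved."""
    return sorted(name for name, path in found.items() if path is None)


if __name__ == "__main__":
    found = find_tools(REQUIRED_TOOLS)
    for name, path in found.items():
        print(f"{name}: {path or 'NOT FOUND on PATH'}")
```

If a tool is reported as not found right after installing it, opening a new terminal usually picks up the updated PATH.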

Running the install script produces the following log.

PS C:\Users\suzuxi\Downloads\ryzen-ai-sw-1.1\ryzen-ai-sw-1.1> .\install.bat
Windows 11: OK
Visual Studio 2019: OK
Python: OK
CONDA Available: OK
CMake: OK
IPU driver Available: OK
All deps are available. Proceeding to Conda env creation...
Do you accept EULA for RyzenAI? [y/n]: y
Ran pip subprocess with arguments:
['C:\\Users\\suzuxi\\anaconda3\\envs\\ryzenai-1.1-20240401-191621\\python.exe', '-m', 'pip', 'install', '-U', '-r', 'C:\\Users\\suzu]
Pip subprocess output:
Processing c:\users\suzuxi\downloads\ryzen-ai-sw-1.1\ryzen-ai-sw-1.1\vai_q_onnx-1.16.0+69bc4f2-py2.py3-none-any.whl (from -r C:\User)
Collecting onnxruntime==1.15.1 (from -r C:\Users\suzuxi\Downloads\ryzen-ai-sw-1.1\ryzen-ai-sw-1.1\condaenv.ld9kx2h3.requirements.txt)
  Obtaining dependency information for onnxruntime==1.15.1 from https://files.pythonhosted.org/packages/6a/fb/99bc0e75f3d23eab0dda64a
  Downloading onnxruntime-1.15.1-cp39-cp39-win_amd64.whl.metadata (4.1 kB)
Collecting coloredlogs (from onnxruntime==1.15.1->-r C:\Users\suzuxi\Downloads\ryzen-ai-sw-1.1\ryzen-ai-sw-1.1\condaenv.ld9kx2h3.req)
  Obtaining dependency information for coloredlogs from https://files.pythonhosted.org/packages/a7/06/3d6badcf13db419e25b07041d9c7b4a
  Downloading coloredlogs-15.0.1-py2.py3-none-any.whl.metadata (12 kB)
Collecting flatbuffers (from onnxruntime==1.15.1->-r C:\Users\suzuxi\Downloads\ryzen-ai-sw-1.1\ryzen-ai-sw-1.1\condaenv.ld9kx2h3.req)
  Obtaining dependency information for flatbuffers from https://files.pythonhosted.org/packages/41/f0/7e988a019bc54b2dbd0ad4182ef2d5a
  Downloading flatbuffers-24.3.25-py2.py3-none-any.whl.metadata (850 bytes)
Collecting numpy>=1.21.6 (from onnxruntime==1.15.1->-r C:\Users\suzuxi\Downloads\ryzen-ai-sw-1.1\ryzen-ai-sw-1.1\condaenv.ld9kx2h3.r)
  Obtaining dependency information for numpy>=1.21.6 from https://files.pythonhosted.org/packages/b5/42/054082bd8220bbf6f297f982f0a8a
  Downloading numpy-1.26.4-cp39-cp39-win_amd64.whl.metadata (61 kB)
     ---------------------------------------- 61.0/61.0 kB 1.6 MB/s eta 0:00:00
Collecting packaging (from onnxruntime==1.15.1->-r C:\Users\suzuxi\Downloads\ryzen-ai-sw-1.1\ryzen-ai-sw-1.1\condaenv.ld9kx2h3.requi)
  Obtaining dependency information for packaging from https://files.pythonhosted.org/packages/49/df/1fceb2f8900f8639e278b056416d4913a
  Downloading packaging-24.0-py3-none-any.whl.metadata (3.2 kB)
Collecting protobuf (from onnxruntime==1.15.1->-r C:\Users\suzuxi\Downloads\ryzen-ai-sw-1.1\ryzen-ai-sw-1.1\condaenv.ld9kx2h3.requir)
  Obtaining dependency information for protobuf from https://files.pythonhosted.org/packages/dc/1c/770724a1bdf5b28712be960bf5e6f47c8a
  Downloading protobuf-5.26.1-cp39-cp39-win_amd64.whl.metadata (592 bytes)
Collecting sympy (from onnxruntime==1.15.1->-r C:\Users\suzuxi\Downloads\ryzen-ai-sw-1.1\ryzen-ai-sw-1.1\condaenv.ld9kx2h3.requireme)
  Obtaining dependency information for sympy from https://files.pythonhosted.org/packages/d2/05/e6600db80270777c4a64238a98d442f0fd07a
  Downloading sympy-1.12-py3-none-any.whl.metadata (12 kB)
Collecting onnx>1.12.0 (from vai-q-onnx==1.16.0+69bc4f2->-r C:\Users\suzuxi\Downloads\ryzen-ai-sw-1.1\ryzen-ai-sw-1.1\condaenv.ld9kx)
  Obtaining dependency information for onnx>1.12.0 from https://files.pythonhosted.org/packages/06/45/ad7485c677edb810dbfc2a8ab58941a
  Downloading onnx-1.16.0-cp39-cp39-win_amd64.whl.metadata (16 kB)
Collecting onnxruntime-extensions (from vai-q-onnx==1.16.0+69bc4f2->-r C:\Users\suzuxi\Downloads\ryzen-ai-sw-1.1\ryzen-ai-sw-1.1\con)
  Obtaining dependency information for onnxruntime-extensions from https://files.pythonhosted.org/packages/f5/8c/5e23b742056f05d9114a
  Downloading onnxruntime_extensions-0.10.1-cp39-cp39-win_amd64.whl.metadata (4.5 kB)
Collecting tqdm (from vai-q-onnx==1.16.0+69bc4f2->-r C:\Users\suzuxi\Downloads\ryzen-ai-sw-1.1\ryzen-ai-sw-1.1\condaenv.ld9kx2h3.req)
  Obtaining dependency information for tqdm from https://files.pythonhosted.org/packages/2a/14/e75e52d521442e2fcc9f1df3c5e456aead034a
  Downloading tqdm-4.66.2-py3-none-any.whl.metadata (57 kB)
     ---------------------------------------- 57.6/57.6 kB 3.0 MB/s eta 0:00:00
Collecting rich (from vai-q-onnx==1.16.0+69bc4f2->-r C:\Users\suzuxi\Downloads\ryzen-ai-sw-1.1\ryzen-ai-sw-1.1\condaenv.ld9kx2h3.req)
  Obtaining dependency information for rich from https://files.pythonhosted.org/packages/87/67/a37f6214d0e9fe57f6ae54b2956d550ca8365a
  Downloading rich-13.7.1-py3-none-any.whl.metadata (18 kB)
Collecting humanfriendly>=9.1 (from coloredlogs->onnxruntime==1.15.1->-r C:\Users\suzuxi\Downloads\ryzen-ai-sw-1.1\ryzen-ai-sw-1.1\c)
  Obtaining dependency information for humanfriendly>=9.1 from https://files.pythonhosted.org/packages/f0/0f/310fb31e39e2d734ccaa2c0a
  Downloading humanfriendly-10.0-py2.py3-none-any.whl.metadata (9.2 kB)
2024-04-01 19:17:14,647 - INFO - copying C:\Windows\System32\AMD\xrt_core.dll to C:\Users\suzuxi\anaconda3\envs\ryzenai-1.1-20240401-191621\lib\site-packages\onnxruntime\capi
2024-04-01 19:17:14,655 - INFO - copying C:\Windows\System32\AMD\xrt_coreutil.dll to C:\Users\suzuxi\anaconda3\envs\ryzenai-1.1-20240401-191621\lib\site-packages\onnxruntime\capi
2024-04-01 19:17:14,666 - INFO - copying C:\Windows\System32\AMD\amd_xrt_core.dll to C:\Users\suzuxi\anaconda3\envs\ryzenai-1.1-20240401-191621\lib\site-packages\onnxruntime\capi
2024-04-01 19:17:14,674 - INFO - copying C:\Windows\System32\AMD\xdp_ml_timeline_plugin.dll to C:\Users\suzuxi\anaconda3\envs\ryzenai-1.1-20240401-191621\lib\site-packages\onnxruntime\capi
2024-04-01 19:17:14,679 - INFO - copying C:\Windows\System32\AMD\xdp_core.dll to C:\Users\suzuxi\anaconda3\envs\ryzenai-1.1-20240401-191621\lib\site-packages\onnxruntime\capi
2024-04-01 19:17:14,685 - INFO - copying C:\Users\suzuxi\Downloads\ryzen-ai-sw-1.1\ryzen-ai-sw-1.1\voe-4.0-win_amd64\..\onnxruntime\bin\onnxruntime.dll to C:\Users\suzuxi\anaconda3\envs\ryzenai-1.1-20240401-191621\lib\site-packages\onnxruntime\capi

Setting RYZEN_AI_INSTALLER env variable ...
Setting XLNX_VART_FIRMWARE env variable ...
Created conda env: ryzenai-1.1-20240401-191621
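The last lines of the log say the installer sets two environment variables, `RYZEN_AI_INSTALLER` and `XLNX_VART_FIRMWARE`. A small sketch for checking that they are visible to a process is shown below. The variable names come from the log above; `check_env` and `unset_vars` are hypothetical helper names, and what values the variables should hold is not something this sketch assumes.

```python
import os

# Variable names taken from the installer log above.
EXPECTED_VARS = ("RYZEN_AI_INSTALLER", "XLNX_VART_FIRMWARE")


def check_env(environ=os.environ, names=EXPECTED_VARS):
    """Return {name: value} for each expected variable, with None if unset."""
    return {name: environ.get(name) for name in names}


def unset_vars(status):
    """Return the names of the variables that are not set."""
    return [name for name, value in status.items() if value is None]


if __name__ == "__main__":
    status = check_env()
    for name, value in status.items():
        print(f"{name} = {value if value is not None else '(not set)'}")
```

A terminal that was already open while the installer ran may not see newly set variables, so it is worth running the check from a fresh prompt.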

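Verifying the installation

From Anaconda Navigator, launch the virtual environment printed at the end of the log (ryzenai-1.1-20240401-191621).

From Anaconda Prompt, run the following to activate the virtual environment.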

conda activate ryzenai-1.1-20240401-191621
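To double-check from Python that the intended environment is the active one, the sketch below reads `CONDA_DEFAULT_ENV`, which conda sets on activation. The environment name is the one from my install log and will differ on other machines; `active_conda_env` and `is_expected_env` are hypothetical helper names.

```python
import os

# Environment name from the install log above; yours will have a different timestamp.
EXPECTED_ENV = "ryzenai-1.1-20240401-191621"


def active_conda_env(environ=os.environ):
    """Return the active conda environment name, or None if none is active."""
    return environ.get("CONDA_DEFAULT_ENV")


def is_expected_env(expected, environ=os.environ):
    """True if the active conda environment matches the expected name."""
    return active_conda_env(environ) == expected


if __name__ == "__main__":
    print("active env:", active_conda_env())
    print("matches expected:", is_expected_env(EXPECTED_ENV))
```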

Run quicktest.py

~\ryzen-ai-sw-1.1\ryzen-ai-sw-1.1\quicktest>python quicktest.py
(ryzenai-1.1-20240401-191621) C:\Users\suzuxi\Downloads\ryzen-ai-sw-1.1\ryzen-ai-sw-1.1\quicktest>python quicktest.py
2024-04-01 19:30:27.1411556 [W:onnxruntime:Default, vitisai_provider_factory.cc:48 onnxruntime::VitisAIProviderFactory::CreateProvider] Construting a FlexML EP instance in Vitis AI EP
2024-04-01 19:30:27.1460630 [W:onnxruntime:Default, vitisai_execution_provider.cc:117 onnxruntime::VitisAIExecutionProvider::SetFlexMLEPPtr] Assigning the FlexML EP pointer in Vitis AI EP
2024-04-01 19:30:27.1998911 [W:onnxruntime:Default, vitisai_execution_provider.cc:137 onnxruntime::VitisAIExecutionProvider::GetCapability] Trying FlexML EP GetCapability
2024-04-01 19:30:27.2054614 [W:onnxruntime:Default, flexml_execution_provider.cc:180 onnxruntime::FlexMLExecutionProvider::GetCapability] FlexMLExecutionProvider::GetCapability, C:\amd\voe\binary-modules\ResNet.flexml\flexml_bm.signature can't not be found!
2024-04-01 19:30:27.2136571 [W:onnxruntime:Default, vitisai_execution_provider.cc:153 onnxruntime::VitisAIExecutionProvider::GetCapability] FlexML EP ignoring a non-ResNet50 graph
WARNING: Logging before InitGoogleLogging() is written to STDERR
I20240401 19:30:27.228062 15492 vitisai_compile_model.cpp:346] Vitis AI EP Load ONNX Model Success
I20240401 19:30:27.229064 15492 vitisai_compile_model.cpp:347] Graph Input Node Name/Shape (1)
I20240401 19:30:27.229064 15492 vitisai_compile_model.cpp:351]   input : [-1x3x32x32]
I20240401 19:30:27.229064 15492 vitisai_compile_model.cpp:357] Graph Output Node Name/Shape (1)
I20240401 19:30:27.230569 15492 vitisai_compile_model.cpp:361]   output : [-1x10]
I20240401 19:30:27.232574 15492 vitisai_compile_model.cpp:232] use cache key modelcachekey_quick
I20240401 19:30:32.836517 15492 compile_pass_manager.cpp:352] Compile mode: aie
I20240401 19:30:32.836517 15492 compile_pass_manager.cpp:353] Debug mode: performance
I20240401 19:30:32.836517 15492 compile_pass_manager.cpp:357] Target architecture: AMD_AIE2_Nx4_Overlay
I20240401 19:30:32.838519 15492 compile_pass_manager.cpp:606] Graph name: main_graph, with op num: 438
I20240401 19:30:32.838519 15492 compile_pass_manager.cpp:619] Begin to compile...
0
W20240401 19:30:38.018899 15492 RedundantOpReductionPass.cpp:664] xir::Op{name = /avgpool/GlobalAveragePool_output_0_DequantizeLinear_Output_vaip_315, type = pool-fix}'s input and output is unchanged, so it will be removed.
I20240401 19:30:38.186391 15492 PartitionPass.cpp:6479] xir::Op{name = output_, type = fix2float} is not supported by current target. Target name: AMD_AIE2_Nx4_Overlay, target type: IPU_PHX. Assign it to CPU.
I20240401 19:30:39.817389 15492 compile_pass_manager.cpp:643] Total device subgraph number 3, USER subgraph number 1
I20240401 19:30:39.817389 15492 compile_pass_manager.cpp:645] Total device subgraph number 3, CPU subgraph number 1
I20240401 19:30:39.817389 15492 compile_pass_manager.cpp:665] Total device subgraph number 3, DPU subgraph number 1, total number of PDI swaps 0
I20240401 19:30:39.817389 15492 compile_pass_manager.cpp:721] Compile done.
I20240401 19:30:39.880322 15492 anchor_point.cpp:443] before optimization:

input_DequantizeLinear_Output <-- identity@ --
input_QuantizeLinear_Output <-- identity@fuse_DPU --
input_QuantizeLinear_Output
after optimization:

input_QuantizeLinear_Output_vaip_426 <-- identity@combine_empty --
input_QuantizeLinear_Output
I20240401 19:30:39.881322 15492 anchor_point.cpp:443] before optimization:

output <-- identity@ --
output_QuantizeLinear_Output <-- identity@fuse_DPU --
output_QuantizeLinear_Output
after optimization:

output_QuantizeLinear_Output_vaip_427 <-- identity@combine_empty --
output_QuantizeLinear_Output
[Vitis AI EP] No. of Operators :   CPU     2    IPU   398  99.50%
[Vitis AI EP] No. of Subgraphs :   CPU     1    IPU     1 Actually running on IPU     1
2024-04-01 19:30:40.0086596 [W:onnxruntime:, session_state.cc:1169 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Some nodes were not assigned to the preferred execution providers which may or may not have an negative impact on performance. e.g. ORT explicitly assigns shape related ops to CPU to improve perf.
2024-04-01 19:30:40.0165886 [W:onnxruntime:, session_state.cc:1171 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Rerunning with verbose output on a non-minimal build will show node assignments.
Test Passed
2024-04-01 19:30:40.9593961 [W:onnxruntime:Default, vitisai_execution_provider.cc:74 onnxruntime::VitisAIExecutionProvider::~VitisAIExecutionProvider] Releasing the FlexML EP pointer in Vitis AI EP

If "Test Passed" appears in the output, the startup test completed successfully.
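Besides "Test Passed", the `[Vitis AI EP] No. of Operators` line shows how much of the model landed on the NPU: 2 operators on CPU and 398 on IPU, reported as 99.50%, which is consistent with 398 / (2 + 398). The sketch below pulls those numbers out of a saved log. The regular expression is written against the exact line format in my log and is an assumption about other versions; `parse_operator_summary` and `has_test_passed` are hypothetical helper names.

```python
import re

# Pattern matched against the summary line as it appears in the log above.
OPERATORS_RE = re.compile(
    r"\[Vitis AI EP\] No\. of Operators :\s+CPU\s+(\d+)\s+IPU\s+(\d+)\s+([\d.]+)%"
)


def parse_operator_summary(log_text):
    """Extract CPU/IPU operator counts; return None if the line is absent."""
    match = OPERATORS_RE.search(log_text)
    if match is None:
        return None
    cpu, ipu = int(match.group(1)), int(match.group(2))
    return {
        "cpu": cpu,
        "ipu": ipu,
        "reported_pct": float(match.group(3)),
        # Share of operators assigned to the IPU, recomputed from the counts.
        "computed_pct": 100.0 * ipu / (cpu + ipu),
    }


def has_test_passed(log_text):
    """True if some line of the log is exactly 'Test Passed' after stripping."""
    return any(line.strip() == "Test Passed" for line in log_text.splitlines())


if __name__ == "__main__":
    sample = (
        "[Vitis AI EP] No. of Operators :   CPU     2    IPU   398  99.50%\n"
        " Test Passed\n"
    )
    print(parse_operator_summary(sample))
    print(has_test_passed(sample))
```

Redirecting the quicktest output to a file and passing its contents to these functions gives a quick pass/fail plus the offload ratio.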

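I had first tried to bring up a Ryzen AI environment on Linux and gave up. Setting it up on Windows as a breather turned out to be easy...
I plan to write up the Linux attempt, up to the point where it failed, in a later post.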

For now, I'll keep experimenting with XDNA on Windows.
