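We have released a high-performance Japanese SPLADE (sparse retrieval) model
Japanese SPLADE v2 released: the best-performing information retrieval model (at 512 tokens or fewer)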
git clone https://github.com/ujnak/splade-service
cd splade-service
% git clone https://github.com/ujnak/splade-service
Cloning into 'splade-service'...
remote: Enumerating objects: 15, done.
remote: Counting objects: 100% (15/15), done.
remote: Compressing objects: 100% (13/13), done.
remote: Total 15 (delta 1), reused 4 (delta 1), pack-reused 0 (from 0)
Receiving objects: 100% (15/15), 5.86 KiB | 5.86 MiB/s, done.
Resolving deltas: 100% (1/1), done.
% cd splade-service
splade-service %
Create a Python virtual environment named splade and activate it. Python 3.13 is used here.
python3.13 -m venv splade
. splade/bin/activate
splade-service % python3.13 -m venv splade
splade-service % . splade/bin/activate
(splade) splade-service %
Install the required packages.
pip install -r requirements.txt
(splade) splade-service % pip install -r requirements.txt
Collecting fastapi>=0.104.0 (from -r requirements.txt (line 1))
Using cached fastapi-0.115.12-py3-none-any.whl.metadata (27 kB)
Collecting uvicorn>=0.24.0 (from uvicorn[standard]>=0.24.0->-r requirements.txt (line 2))
[output omitted]
Using cached hf_xet-1.1.2-cp37-abi3-macosx_11_0_arm64.whl (2.5 MB)
Using cached idna-3.10-py3-none-any.whl (70 kB)
Using cached MarkupSafe-3.0.2-cp313-cp313-macosx_11_0_arm64.whl (12 kB)
Using cached mpmath-1.3.0-py3-none-any.whl (536 kB)
Using cached urllib3-2.4.0-py3-none-any.whl (128 kB)
Using cached sniffio-1.3.1-py3-none-any.whl (10 kB)
Installing collected packages: unidic-lite, mpmath, websockets, uvloop, urllib3, typing-extensions, tqdm, sympy, sniffio, setuptools, safetensors, regex, pyyaml, python-dotenv, protobuf, packaging, numpy, networkx, MarkupSafe, idna, httptools, hf-xet, h11, fugashi, fsspec, filelock, click, charset-normalizer, certifi, annotated-types, uvicorn, typing-inspection, requests, pydantic-core, jinja2, anyio, watchfiles, torch, starlette, pydantic, huggingface-hub, tokenizers, fastapi, transformers
Successfully installed MarkupSafe-3.0.2 annotated-types-0.7.0 anyio-4.9.0 certifi-2025.4.26 charset-normalizer-3.4.2 click-8.2.1 fastapi-0.115.12 filelock-3.18.0 fsspec-2025.5.1 fugashi-1.4.3 h11-0.16.0 hf-xet-1.1.2 httptools-0.6.4 huggingface-hub-0.32.3 idna-3.10 jinja2-3.1.6 mpmath-1.3.0 networkx-3.5 numpy-2.2.6 packaging-25.0 protobuf-6.31.1 pydantic-2.11.5 pydantic-core-2.33.2 python-dotenv-1.1.0 pyyaml-6.0.2 regex-2024.11.6 requests-2.32.3 safetensors-0.5.3 setuptools-80.9.0 sniffio-1.3.1 starlette-0.46.2 sympy-1.14.0 tokenizers-0.21.1 torch-2.7.0 tqdm-4.67.1 transformers-4.52.4 typing-extensions-4.13.2 typing-inspection-0.4.1 unidic-lite-1.0.8 urllib3-2.4.0 uvicorn-0.34.3 uvloop-0.21.0 watchfiles-1.0.5 websockets-15.0.1
[notice] A new release of pip is available: 25.0.1 -> 25.1.1
[notice] To update, run: pip install --upgrade pip
(splade) splade-service %
python server.py
(splade) splade-service % python server.py
INFO: Started server process [73108]
INFO: Waiting for application startup.
INFO:__main__:Japanese SPLADE v2モデルを読み込み中...
INFO:__main__:モデルの読み込みが完了しました (device: cpu)
INFO: Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:7999 (Press CTRL+C to quit)
splade-service % sh curl-01.sh
{"text":"これは日本語のテストです","sparse_vector":{"indices":[429,4262,5630,12499,12500,12538,13037,13374,13449,13459,13618,13821,14877,14985,22460,24096],"values":[0.10150858014822006,0.17410790920257568,0.27124056220054626,0.49592137336730957,1.5145821571350098,0.7409901022911072,0.228782057762146,0.06052016094326973,0.5196099281311035,0.4933662414550781,0.023301351815462112,0.5448876023292542,0.43785205483436584,1.2762678861618042,0.10169661045074463,0.08509977906942368],"vocab_size":32768}}
splade-service %
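The contents of curl-01.sh are not shown here, but the same call can be sketched in Python. This is a minimal sketch, assuming the server accepts a POST to /encode (the path that appears in the endpoint setting later in this article) with a JSON body of the form {"text": ...}. That body shape is an assumption inferred from the text field echoed in the response, not something confirmed from server.py. The helper names build_encode_request, encode, and ranked_terms are made up for illustration.

```python
import json
import urllib.request

# Assumption: POST /encode with a JSON body {"text": ...}.
# Only the path and the response shape are visible in this article.
ENDPOINT = "http://localhost:7999/encode"


def build_encode_request(text: str, endpoint: str = ENDPOINT) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for the encode endpoint."""
    body = json.dumps({"text": text}, ensure_ascii=False).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def encode(text: str, endpoint: str = ENDPOINT) -> dict:
    """Send the request and decode the JSON response (needs a running server)."""
    with urllib.request.urlopen(build_encode_request(text, endpoint), timeout=30) as resp:
        return json.load(resp)


def ranked_terms(response: dict) -> list[tuple[int, float]]:
    """Pair each vocabulary index with its weight, heaviest first."""
    sv = response["sparse_vector"]
    pairs = zip(sv["indices"], sv["values"])
    return sorted(pairs, key=lambda p: p[1], reverse=True)
```

With the server running, ranked_terms(encode("これは日本語のテストです")) lists the (index, weight) pairs in descending order of weight. For the response shown above, the heaviest entry is index 12500.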
Press CTRL+C to stop the API server.
(splade) splade-service % podman build --file Dockerfile --tag japanese-splade-v2 .
STEP 1/5: FROM python:3.13
STEP 2/5: WORKDIR /app
--> 6abc32147ceb
STEP 3/5: COPY server.py requirements.txt .
--> 3b79e4f8fa7d
STEP 4/5: RUN pip install -r requirements.txt
Collecting fastapi>=0.104.0 (from -r requirements.txt (line 1))
Downloading fastapi-0.115.12-py3-none-any.whl.metadata (27 kB)
Collecting uvicorn>=0.24.0 (from uvicorn[standard]>=0.24.0->-r requirements.txt (line 2))
Downloading uvicorn-0.34.3-py3-none-any.whl.metadata (6.5 kB)
Collecting transformers>=4.35.0 (from -r requirements.txt (line 3))
Downloading transformers-4.52.4-py3-none-any.whl.metadata (38 kB)
Collecting torch>=2.0.0 (from -r requirements.txt (line 4))
Downloading torch-2.7.0-cp313-cp313-manylinux_2_28_aarch64.whl.metadata (29 kB)
Collecting numpy>=1.24.0 (from -r requirements.txt (line 5))
Downloading numpy-2.2.6-cp313-cp313-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.metadata (63 kB)
Collecting pydantic>=2.0.0 (from -r requirements.txt (line 6))
Downloading pydantic-2.11.5-py3-none-any.whl.metadata (67 kB)
Collecting fugashi>=1.3.0 (from -r requirements.txt (line 7))
Downloading fugashi-1.4.3-cp313-cp313-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.metadata (7.1 kB)
Collecting protobuf>=4.21.0 (from -r requirements.txt (line 8))
Downloading protobuf-6.31.1-cp39-abi3-manylinux2014_aarch64.whl.metadata (593 bytes)
Collecting unidic-lite>=1.0.8 (from -r requirements.txt (line 9))
Downloading unidic-lite-1.0.8.tar.gz (47.4 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 47.4/47.4 MB 40.8 MB/s eta 0:00:00
Installing build dependencies: started
Installing build dependencies: finished with status 'done'
Getting requirements to build wheel: started
Getting requirements to build wheel: finished with status 'done'
Preparing metadata (pyproject.toml): started
Preparing metadata (pyproject.toml): finished with status 'done'
[output omitted]
Created wheel for unidic-lite: filename=unidic_lite-1.0.8-py3-none-any.whl size=47658912 sha256=4f876a371019546e995b130a47ebadbe7d21e3494b8d7cc631425ce450eac8d6
Stored in directory: /root/.cache/pip/wheels/93/40/f2/23f8a0da599c4174200d54d0b933aedc464a12f1061e4aabba
Successfully built unidic-lite
Installing collected packages: unidic-lite, mpmath, websockets, uvloop, urllib3, typing-extensions, tqdm, sympy, sniffio, setuptools, safetensors, regex, pyyaml, python-dotenv, protobuf, packaging, numpy, networkx, MarkupSafe, idna, httptools, hf-xet, h11, fugashi, fsspec, filelock, click, charset-normalizer, certifi, annotated-types, uvicorn, typing-inspection, requests, pydantic-core, jinja2, anyio, watchfiles, torch, starlette, pydantic, huggingface-hub, tokenizers, fastapi, transformers
Successfully installed MarkupSafe-3.0.2 annotated-types-0.7.0 anyio-4.9.0 certifi-2025.4.26 charset-normalizer-3.4.2 click-8.2.1 fastapi-0.115.12 filelock-3.18.0 fsspec-2025.5.1 fugashi-1.4.3 h11-0.16.0 hf-xet-1.1.2 httptools-0.6.4 huggingface-hub-0.32.3 idna-3.10 jinja2-3.1.6 mpmath-1.3.0 networkx-3.5 numpy-2.2.6 packaging-25.0 protobuf-6.31.1 pydantic-2.11.5 pydantic-core-2.33.2 python-dotenv-1.1.0 pyyaml-6.0.2 regex-2024.11.6 requests-2.32.3 safetensors-0.5.3 setuptools-80.9.0 sniffio-1.3.1 starlette-0.46.2 sympy-1.14.0 tokenizers-0.21.1 torch-2.7.0 tqdm-4.67.1 transformers-4.52.4 typing-extensions-4.13.2 typing-inspection-0.4.1 unidic-lite-1.0.8 urllib3-2.4.0 uvicorn-0.34.3 uvloop-0.21.0 watchfiles-1.0.5 websockets-15.0.1
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager, possibly rendering your system unusable. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv. Use the --root-user-action option if you know what you are doing and want to suppress this warning.
[notice] A new release of pip is available: 25.0.1 -> 25.1.1
[notice] To update, run: pip install --upgrade pip
--> 7813f267ac01
STEP 5/5: CMD ["uvicorn", "server:app", "--host", "0.0.0.0", "--port", "7999", "--log-level", "info", "--access-log"]
COMMIT japanese-splade-v2
--> 6db870d05d4c
Successfully tagged localhost/japanese-splade-v2:latest
6db870d05d4c5b1d14d11376eb7555b46705a226efe0305983b9f5ff9c9c4f26
(splade) splade-service %
Confirm that the image japanese-splade-v2 has been created.
(splade) splade-service % podman image ls japanese-splade-v2
REPOSITORY                    TAG     IMAGE ID      CREATED        SIZE
localhost/japanese-splade-v2  latest  6db870d05d4c  2 minutes ago  2.36 GB
(splade) splade-service %
Create and run a container named splade from the container image you just built.
(splade) splade-service % podman run -d --name splade -p 7999:7999 localhost/japanese-splade-v2
2fe3d3d0128995768069f125ba52ab1adfad5a6ac44074bcbbe5b898cd73e532
(splade) splade-service %
Check the container logs. Once Uvicorn has started and is waiting for requests, the server is ready to generate sparse vectors.
(splade) splade-service % podman logs -f splade
INFO: Started server process [1]
INFO: Waiting for application startup.
INFO:server:Japanese SPLADE v2モデルを読み込み中...
INFO:server:モデルの読み込みが完了しました (device: cpu)
INFO: Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:7999 (Press CTRL+C to quit)
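Instead of watching the log by eye, a script can poll the published port until it accepts connections. The following is a minimal sketch using only the standard library; wait_for_port is a hypothetical helper name, and the host and port are the ones used in this article. Note that an open port only shows that something is listening. Depending on how the container runtime forwards ports, it may accept connections before the model has finished loading, so the log line above remains the reliable signal.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 60.0, interval: float = 0.5) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means a listener exists on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Refused or timed out: retry until the deadline passes.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


if __name__ == "__main__":
    ready = wait_for_port("localhost", 7999, timeout=5.0)
    print("port 7999 is open" if ready else "port 7999 is not reachable")
```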
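Next, create an application that calls the API server to update the column VEC of the table EBAJ_SPARSE_VECTORS and that, given a query document as input, calculates the distance to the VEC column of EBAJ_SPARSE_VECTORS. The application export is available at the following location.
As before, the application is imported into an APEX environment running in a container on the local machine. How to create that environment is described in the article "Running Oracle Database Free and Oracle REST Data Services as containers with podman".
When you import and run the application, a screen like the following opens.
Before running it, set the API server endpoint in the application.
The API server endpoint is set as the substitution string G_ENDPOINT in the application definition. Because the connection goes from the container running APEX to the host outside that container, the host name is host.containers.internal.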
http://host.containers.internal:7999/encode
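The response from this endpoint has to be turned into a value that a SPARSE vector column or bind variable accepts. The following is a minimal sketch, assuming the database takes sparse vectors as text of the form [dimension count, [indices], [values]]. The function name to_sparse_vector_literal is hypothetical, and this is not necessarily how the exported application does the conversion. Whether the database expects the same index base as the API response should be checked against the Oracle documentation, so the sketch makes the caller pass the offset explicitly instead of picking a default.

```python
def to_sparse_vector_literal(sparse_vector: dict, *, index_offset: int) -> str:
    """Format an API sparse_vector as '[dim,[indices],[values]]' text.

    index_offset is added to every index. Passing 0 keeps the indices as
    returned by the API; passing 1 shifts them for a 1-based target.
    Which one is right for the database is an assumption to verify.
    """
    indices = [i + index_offset for i in sparse_vector["indices"]]
    values = sparse_vector["values"]
    if len(indices) != len(values):
        raise ValueError("indices and values must have the same length")
    dim = sparse_vector["vocab_size"]
    idx_text = ",".join(str(i) for i in indices)
    val_text = ",".join(repr(float(v)) for v in values)
    return f"[{dim},[{idx_text}],[{val_text}]]"


if __name__ == "__main__":
    sample = {"indices": [429, 12500], "values": [0.5, 1.25], "vocab_size": 32768}
    print(to_sparse_vector_literal(sample, index_offset=0))
```

The resulting string is the kind of value that could be bound to an item such as P3_QUERY_VECTOR and converted with VECTOR(..., 32768, FLOAT32, SPARSE).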
SELECT id, text,
       VECTOR_DISTANCE(vec, VECTOR(:P3_QUERY_VECTOR, 32768, FLOAT32, SPARSE), DOT) AS distance
FROM ebaj_sparse_vectors
ORDER BY
       VECTOR_DISTANCE(vec, VECTOR(:P3_QUERY_VECTOR, 32768, FLOAT32, SPARSE), DOT)
FETCH FIRST 10 ROWS ONLY;
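To see why an ascending ORDER BY returns the best matches first, it helps to reproduce the ranking outside the database. The following is a minimal sketch, assuming the DOT metric of VECTOR_DISTANCE is the negated dot product, as Oracle's documentation describes it. Sparse vectors are represented as plain {index: weight} dictionaries, and the names sparse_dot_distance and top_k are made up for illustration.

```python
def sparse_dot_distance(a: dict[int, float], b: dict[int, float]) -> float:
    """Negated dot product of two sparse vectors given as {index: weight}."""
    # Iterate over the shorter vector; only shared indices contribute.
    if len(a) > len(b):
        a, b = b, a
    return -sum((w * b[i] for i, w in a.items() if i in b), 0.0)


def top_k(query: dict[int, float], docs: dict[str, dict[int, float]], k: int = 10):
    """Return up to k (doc_id, distance) pairs, smallest distance first."""
    scored = [(doc_id, sparse_dot_distance(query, vec)) for doc_id, vec in docs.items()]
    return sorted(scored, key=lambda item: item[1])[:k]


if __name__ == "__main__":
    query = {1: 1.0, 5: 2.0}
    docs = {"a": {1: 0.5}, "b": {5: 1.0, 7: 3.0}, "c": {9: 1.0}}
    print(top_k(query, docs, k=2))
```

Because the distance is negated, a larger overlap in weighted terms gives a smaller (more negative) value, so sorting in ascending order puts the most similar documents at the top.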
(splade) splade-service % pip install yasem
Collecting yasem
Using cached yasem-0.4.1-py3-none-any.whl.metadata (4.5 kB)
Requirement already satisfied: numpy>=2.0.0 in ./splade/lib/python3.13/site-packages (from yasem) (2.2.6)
Collecting scipy>=1.13.1 (from yasem)
Using cached scipy-1.15.3-cp313-cp313-macosx_14_0_arm64.whl.metadata (61 kB)
Requirement already satisfied: torch>=2.2.0 in ./splade/lib/python3.13/site-packages (from yasem) (2.7.0)
Requirement already satisfied: transformers>=4.44.0 in ./splade/lib/python3.13/site-packages (from yasem) (4.52.4)
Requirement already satisfied: filelock in ./splade/lib/python3.13/site-packages (from torch>=2.2.0->yasem) (3.18.0)
Requirement already satisfied: typing-extensions>=4.10.0 in ./splade/lib/python3.13/site-packages (from torch>=2.2.0->yasem) (4.13.2)
[output omitted]
Using cached yasem-0.4.1-py3-none-any.whl (7.3 kB)
Using cached scipy-1.15.3-cp313-cp313-macosx_14_0_arm64.whl (22.4 MB)
Installing collected packages: scipy, yasem
Successfully installed scipy-1.15.3 yasem-0.4.1
[notice] A new release of pip is available: 25.0.1 -> 25.1.1
[notice] To update, run: pip install --upgrade pip
(splade) splade-service %
(splade) splade-service % python token_values.py
{'日本': 1.5146484375, 'テスト': 1.2763671875, 'これ': 0.74072265625, '言語': 0.5458984375, '言葉': 0.5205078125, 'この': 0.496337890625, '試験': 0.49267578125, '検査': 0.4375, '語': 0.2705078125, 'です': 0.2301025390625, '私': 0.1734619140625, 'か': 0.1029052734375, 'みたい': 0.10205078125, 'わかり': 0.08514404296875, 'ここ': 0.0596923828125, '種類': 0.0241241455078125}
(splade) splade-service %