Add Support for Local LLama Models via LlamaCpp Integration #781

Open · wants to merge 1 commit into main
7 changes: 6 additions & 1 deletion .env.sample
@@ -1,4 +1,4 @@
# OPENAI or AZURE or ANTHROPIC
# OPENAI or AZURE or ANTHROPIC or LLAMA
ENDPOINT=OPENAI

# USE when ENDPOINT=OPENAI
@@ -16,6 +16,11 @@ AZURE_OPENAI_ENDPOINT=https://<your-azure-openai-endpoint>.openai.azure.com/
ANTHROPIC_API_KEY=<your-anthropic-api-key>
ANTHROPIC_API_MODEL=<your-anthropic-api-name>

# USE when ENDPOINT=LLAMA
LLAMA_MODEL_PATH=/path/to/llama/model.gguf
LLAMA_N_CTX=2048
LLAMA_N_GPU_LAYERS=0

# LangSmith
LANGCHAIN_TRACING_V2=false
LANGCHAIN_ENDPOINT=https://api.smith.langchain.com
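For reference, a minimal configuration for the new backend might look like the following (the model filename is illustrative; LLAMA_N_GPU_LAYERS is the number of layers llama.cpp offloads to the GPU, with 0 keeping inference entirely on the CPU):

ENDPOINT=LLAMA
LLAMA_MODEL_PATH=/models/llama-2-7b.Q4_K_M.gguf
LLAMA_N_CTX=2048
LLAMA_N_GPU_LAYERS=0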
28 changes: 28 additions & 0 deletions gpt_all_star/core/llm.py
@@ -5,12 +5,16 @@
from langchain_anthropic import ChatAnthropic
from langchain_core.language_models.chat_models import BaseChatModel
from langchain_openai import AzureChatOpenAI, ChatOpenAI
from langchain_community.llms import LlamaCpp
from langchain.callbacks.manager import CallbackManager
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler


class LLM_TYPE(str, Enum):
OPENAI = "OPENAI"
AZURE = "AZURE"
ANTHROPIC = "ANTHROPIC"
LLAMA = "LLAMA"
Comment on lines +8 to +17

CodeRabbit

This adds the LlamaCpp import and a LLAMA option to the LLM_TYPE enum, introducing support for local LLama models.
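Because LLM_TYPE subclasses str, the value of the existing ENDPOINT environment variable can be converted to the enum directly; the selection presumably looks something like this sketch (the actual lookup lives elsewhere in gpt_all_star):

import os

llm_type = LLM_TYPE(os.getenv("ENDPOINT", "OPENAI"))  # "LLAMA" -> LLM_TYPE.LLAMA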



def create_llm(llm_name: LLM_TYPE) -> BaseChatModel:
@@ -37,6 +41,13 @@ def create_llm(llm_name: LLM_TYPE) -> BaseChatModel:
model_name=os.getenv("ANTHROPIC_API_MODEL", "claude-3-opus-20240229"),
temperature=0.1,
)
elif llm_name == LLM_TYPE.LLAMA:
return _create_chat_llama(
model_path=os.getenv("LLAMA_MODEL_PATH"),
temperature=0.1,
n_ctx=int(os.getenv("LLAMA_N_CTX", "2048")),
n_gpu_layers=int(os.getenv("LLAMA_N_GPU_LAYERS", "0")),
)
Comment on lines +44 to +50

CodeRabbit

The create_llm() function has been updated to call _create_chat_llama() when the type is LLM_TYPE.LLAMA, which makes it possible to construct LLama models.
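A usage sketch, assuming the LLAMA_* variables from .env.sample are set:

from gpt_all_star.core.llm import LLM_TYPE, create_llm

llm = create_llm(LLM_TYPE.LLAMA)  # reads LLAMA_MODEL_PATH, LLAMA_N_CTX, LLAMA_N_GPU_LAYERS from the environment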

else:
raise ValueError(f"Unsupported LLM type: {llm_name}")

@@ -83,3 +94,20 @@ def _create_chat_anthropic(
temperature=temperature,
streaming=True,
)


def _create_chat_llama(
model_path: str,
temperature: float,
n_ctx: int = 2048,
n_gpu_layers: int = 0,
) -> LlamaCpp:
callback_manager = CallbackManager([StreamingStdOutCallbackHandler()])
return LlamaCpp(
model_path=model_path,
temperature=temperature,
n_ctx=n_ctx,
n_gpu_layers=n_gpu_layers,
callback_manager=callback_manager,
verbose=True,
)
Comment on lines +99 to +113

CodeRabbit

A new _create_chat_llama() function has been added. It initializes the LLama model and returns a LlamaCpp instance.
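One design note: unlike the other branches, LlamaCpp from langchain_community.llms is a completion-style LLM rather than a BaseChatModel subclass, so create_llm's return annotation is loose for this backend, and streaming is wired through StreamingStdOutCallbackHandler rather than the streaming=True flag used elsewhere. Calling the helper directly might look like this sketch (the path is illustrative):

llm = _create_chat_llama(
    model_path="/models/llama-2-7b.Q4_K_M.gguf",  # illustrative local path to a GGUF model
    temperature=0.1,
)
print(llm.invoke("Q: What does n_ctx control? A:"))  # tokens also stream to stdout via the callback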

38 changes: 37 additions & 1 deletion tests/core/test_llm.py
@@ -2,7 +2,7 @@

import pytest

from gpt_all_star.core.llm import _create_chat_openai
from gpt_all_star.core.llm import _create_chat_openai, _create_chat_llama


@pytest.fixture
@@ -17,6 +17,18 @@ def mock_chat_openai():
yield mock


@pytest.fixture
def mock_llamacpp():
with patch("gpt_all_star.core.llm.LlamaCpp") as mock:
yield mock


@pytest.fixture
def mock_callback_manager():
with patch("gpt_all_star.core.llm.CallbackManager") as mock:
yield mock
Comment on lines +20 to +29

CodeRabbit

Two new fixtures, mock_llamacpp and mock_callback_manager, have been added. They create mocks of LlamaCpp and CallbackManager for use in the tests. No issues.
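Note that both fixtures patch the names at their point of use (gpt_all_star.core.llm.LlamaCpp and gpt_all_star.core.llm.CallbackManager) rather than in the modules that define them, which is what allows _create_chat_llama to pick up the mocks.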



def test_create_chat_openai_with_base_url(mock_openai, mock_chat_openai):
base_url = "https://custom-openai-api.com/v1"
_create_chat_openai(model_name="gpt-4", temperature=0.1, base_url=base_url)
@@ -40,3 +52,27 @@ def test_create_chat_openai_without_base_url(mock_openai, mock_chat_openai):
client=mock_openai.chat.completions,
openai_api_base=None,
)


def test_create_chat_llama(mock_llamacpp, mock_callback_manager):
model_path = "/path/to/model.gguf"
temperature = 0.1
n_ctx = 2048
n_gpu_layers = 0

_create_chat_llama(
model_path=model_path,
temperature=temperature,
n_ctx=n_ctx,
n_gpu_layers=n_gpu_layers,
)

mock_callback_manager.assert_called_once()
mock_llamacpp.assert_called_once_with(
model_path=model_path,
temperature=temperature,
n_ctx=n_ctx,
n_gpu_layers=n_gpu_layers,
callback_manager=mock_callback_manager.return_value,
verbose=True,
)
Comment on lines +57 to +78

CodeRabbit

A new test function, test_create_chat_llama, has been added. It verifies that _create_chat_llama behaves as expected; specifically, that LlamaCpp and CallbackManager are invoked and initialized with the correct parameters. No issues.
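The new test can be run on its own with pytest's keyword filter, for example:

pytest tests/core/test_llm.py -k test_create_chat_llama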
