FEAT: GPTFuzzer Orchestrator #226

Closed

Conversation

gseetha04
Contributor

Description

Adds a new orchestrator based on the GPTFuzzer paper. It uses the MCTS algorithm to select a jailbreak template, applies a prompt converter to it, and sends the result to the target to get a response.

Implements the MCTS algorithm for seed selection.
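
For readers unfamiliar with the idea, here is a minimal sketch of MCTS-style seed selection with a UCT score. It is not the code from this PR: `TemplateNode`, `uct_score`, `select`, `update`, and the `stop_prob` parameter are hypothetical names, and the reward update is deliberately simpler than the one described in the GPTFuzzer paper.

```python
import math
import random
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class TemplateNode:
    """Hypothetical node holding one jailbreak template in the search tree."""
    template: str
    parent: Optional["TemplateNode"] = None
    children: List["TemplateNode"] = field(default_factory=list)
    visits: int = 0
    reward: float = 0.0


def uct_score(node: TemplateNode, step: int, c: float = 1.0) -> float:
    """Average reward plus an exploration bonus that shrinks with visits."""
    exploit = node.reward / (node.visits + 1)
    explore = c * math.sqrt(2 * math.log(step + 1) / (node.visits + 1))
    return exploit + explore


def select(
    roots: List[TemplateNode],
    step: int,
    rng: random.Random,
    stop_prob: float = 0.1,
) -> Tuple[TemplateNode, List[TemplateNode]]:
    """Walk down from the best root; return the chosen node and its path."""
    node = max(roots, key=lambda n: uct_score(n, step))
    path = [node]
    while node.children:
        # Assumption: stop early with a small probability so that
        # non-leaf templates can also be chosen as seeds.
        if rng.random() < stop_prob:
            break
        node = max(node.children, key=lambda n: uct_score(n, step))
        path.append(node)
    for visited in path:
        visited.visits += 1
    return node, path


def update(path: List[TemplateNode], jailbreaks: int) -> None:
    """Credit every node on the selected path with the observed successes."""
    for visited in path:
        visited.reward += jailbreaks


rng = random.Random(0)
roots = [TemplateNode("template A"), TemplateNode("template B")]
seed, path = select(roots, step=0, rng=rng)
update(path, jailbreaks=1)
```

In a full loop, a mutated template that succeeds would be appended as a child of `seed`, so later rounds can descend into it.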

@gseetha04
Contributor Author

gseetha04 commented May 29, 2024

@microsoft-github-policy-service agree company="Centific"

scored_response.append(
self._scorer.score_async(response))

batch_scored_response = await asyncio.gather(*scored_response)
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be a lot of concurrent calls. Maybe a batch size would help: with more than a few, you'll just overwhelm the scoring target, leading to failures. For batching we usually use a method on the normalizer, but the scorer doesn't have one yet, if I remember correctly. Perhaps the batching logic itself should move to the scorer, so that a batch method is available there and you can just call it from here without worrying about batching in an orchestrator. Cc @rlundeen2

Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that #331 is adding batch scoring which we can use here! Hooray!

Let's just leave this comment open until #331 is merged.

@romanlutz romanlutz linked an issue Jul 23, 2024 that may be closed by this pull request
Contributor

@romanlutz romanlutz left a comment

Mostly tiny things but wanted to be thorough since we're getting close.

@romanlutz romanlutz changed the title from "[DRAFT] FEAT: GPTFuzzer Orchestrator" to "FEAT: GPTFuzzer Orchestrator" Aug 30, 2024
@romanlutz romanlutz marked this pull request as ready for review August 30, 2024 18:45
Comment on lines 110 to 121
with patch.object(fuzzer_orchestrator, '_select' ) as mock_get_seed:
    mock_get_seed.return_value = prompt_node # return a promptnode
with patch.object(fuzzer_orchestrator,'_apply_template_converter') as mock_apply_template_converter:
    mock_apply_template_converter.return_value = prompt_node #return_value
with patch.object(fuzzer_orchestrator,'_update') as mock_update:
    fuzzer_orchestrator._prompt_normalizer = AsyncMock()
    fuzzer_orchestrator._prompt_normalizer.send_prompt_batch_to_target_async = AsyncMock(return_value=prompt_target_response) #return_value
    fuzzer_orchestrator._scorer = AsyncMock()

    fuzzer_orchestrator._scorer.score_async = AsyncMock( # type: ignore
        side_effect =[[false_score] * (rounds-1) * len(simple_prompts.prompts) + [true_score] * len(simple_prompts.prompts) ] #score2, score2,score2, score2,score1
    )
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something is off here.
As soon as you're outside a with-block, its context is gone, so the patch is undone and the mock no longer applies.
I think they should be nested (?)

Plus, you're defining a side_effect in the last lines but never using it. Where are we actually calling execute_fuzzer?
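
To make the point concrete, here is a minimal sketch of how the patches could be kept active while the code under test runs. `ToyOrchestrator` and `run_round` are hypothetical stand-ins, not the PR's classes. Note also that `side_effect` hands out one item per call, so it should be a flat list; a list wrapped in another list would return the whole inner list on the first call.

```python
import asyncio
from unittest.mock import AsyncMock, patch


class ToyOrchestrator:
    """Hypothetical stand-in for the orchestrator under test."""

    def __init__(self, scorer):
        self._scorer = scorer

    def _select(self):
        return "real seed"

    def _apply_template_converter(self, seed):
        return f"converted {seed}"

    async def run_round(self):
        prompt = self._apply_template_converter(self._select())
        return prompt, await self._scorer.score_async(prompt)


def test_patches_are_active_during_the_call():
    scorer = AsyncMock()
    # Flat list: the first await returns False, the second returns True.
    scorer.score_async = AsyncMock(side_effect=[False, True])
    orchestrator = ToyOrchestrator(scorer)

    # A single with-statement keeps both patches active for its whole body,
    # which is where the method under test has to be called.
    with patch.object(orchestrator, "_select", return_value="seed") as mock_select, \
         patch.object(orchestrator, "_apply_template_converter",
                      return_value="prompt") as mock_convert:
        first = asyncio.run(orchestrator.run_round())
        second = asyncio.run(orchestrator.run_round())

    assert first == ("prompt", False)
    assert second == ("prompt", True)
    assert mock_select.call_count == 2
    mock_convert.assert_called_with("seed")
    # Outside the block the patches are undone and the real method is back.
    assert orchestrator._select() == "real seed"
```

The same structure would apply to the real test: patch `_select`, `_apply_template_converter`, and `_update` in one with-statement (or nested ones), and call `execute_fuzzer` inside it.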

Development

Successfully merging this pull request may close these issues.

FEAT add fuzzer orchestrator
2 participants