Decoding LLM Parameters, Part 2: Top-P (Nucleus Sampling)

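LLM Parameters

Like any machine learning model, large language models have various parameters that control the variability of the generated text. We started a multi-part series to explain the impact of these parameters in detail. By the end of the series, we will arrive at the right balance for content generation using all of the parameters discussed along the way.

This is the second part, where we discuss another well-known parameter, "Top-P."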

Top-P (Nucleus Sampling)

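Top-P is well suited to controlling the diversity of a model's output. At each step, the model samples only from the smallest set of candidate words whose cumulative probability reaches Top-P. A low Top-P therefore pushes the model toward the most probable words, while a high Top-P lets it draw on a wider variety of words, which increases creativity.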
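To make the mechanism concrete, here is a minimal sketch of the filtering step behind nucleus sampling, written in plain Python rather than taken from any library. It keeps the smallest set of tokens whose cumulative probability reaches `top_p` and renormalizes what is left. The function name `top_p_filter` and the toy distribution are hypothetical, invented purely for illustration.

```python
def top_p_filter(probs, top_p):
    """Return the nucleus: the smallest set of tokens whose cumulative
    probability reaches top_p, renormalized so the kept values sum to 1."""
    # Rank candidate tokens from most to least probable.
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)

    nucleus = []
    cumulative = 0.0
    for token, prob in ranked:
        nucleus.append((token, prob))
        cumulative += prob
        if cumulative >= top_p:
            break  # the nucleus is complete; everything after is discarded

    total = sum(prob for _, prob in nucleus)
    return {token: prob / total for token, prob in nucleus}


# Hypothetical next-token distribution (assumption, not real model output).
toy_probs = {
    "exercise": 0.42,
    "sleep": 0.23,
    "meditate": 0.14,
    "breathe": 0.09,
    "travel": 0.07,
    "juggle": 0.05,
}

for top_p in (0.1, 0.5, 0.9):
    kept = top_p_filter(toy_probs, top_p)
    print(f"top_p={top_p}: {len(kept)} candidate(s) -> {list(kept)}")
```

With this toy distribution, a `top_p` of 0.1 leaves only the single most probable token, so sampling becomes effectively deterministic, while 0.9 keeps five of the six candidates.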

Let's see Top-P in action with the following code and output.

Python

import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

# Load GPT-2 model and tokenizer
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

# Add pad token to tokenizer (GPT-2 doesn't have it by default)
tokenizer.pad_token = tokenizer.eos_token

# Function to generate response with varying top_p
def generate_with_top_p(prompt, top_p):
    inputs = tokenizer(prompt, return_tensors='pt', padding=True)

    # Set the attention_mask and pad_token_id
    outputs = model.generate(
        inputs.input_ids,
        attention_mask=inputs['attention_mask'],
        do_sample=True,
        max_length=200,
        top_p=top_p,
        pad_token_id=tokenizer.eos_token_id
    )
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

prompt = "What are some effective ways to manage stress in daily life?"

# List of top-p values and their descriptions
top_p_values = {
    0.1: "Very conservative: Generates highly probable and safe responses.",
    0.3: "Conservative: Generates probable responses with less risk.",
    0.5: "Balanced: A mix of safe and creative responses.",
    0.7: "Creative: Generates more diverse and creative responses.",
    0.9: "Very creative: Allows for highly diverse and less probable responses."
}

# Test top_p variations
for top_p, description in top_p_values.items():
    print(f"\nTop-p {top_p} ({description}):\n")
    print(generate_with_top_p(prompt, top_p=top_p))

Output:

PowerShell

python test_top_p.py

Top-p 0.1 (Very conservative: Generates highly probable and safe responses.):

What are some effective ways to manage stress in daily life?

The following are some of the most common ways to manage stress in daily life.

1. Avoiding stress

The most common way to avoid stress is to avoid it.

The most common way to avoid stress is to avoid it.

2. Avoiding stress

The most common way to avoid stress is to avoid it.

The most common way to avoid stress is to avoid it.

3. Avoiding stress

The most common way to avoid stress is to avoid it.

The most common way to avoid stress is to avoid it.

4. Avoiding stress

The most common way to avoid stress is to avoid it.

The most common way to avoid stress is to avoid it.

5. Avoiding stress

The most common way to avoid stress is to avoid it.

The most common way to avoid stress

Top-p 0.3 (Conservative: Generates probable responses with less risk.):

What are some effective ways to manage stress in daily life?

What are some effective ways to manage stress in daily life?

What are some effective ways to manage stress in daily life?

What are some effective ways to manage stress in daily life?

What are some effective ways to manage stress in daily life?

What are some effective ways to manage stress in daily life?

What are some effective ways to manage stress in daily life?

What are some effective ways to manage stress in daily life?

What are some effective ways to manage stress in daily life?

What are some effective ways to manage stress in daily life?

What are some effective ways to manage stress in daily life?

What are some effective ways to manage stress in daily life?

What are some effective ways to manage stress in daily life?

What are some effective ways to manage stress in daily life?

What are some effective

Top-p 0.5 (Balanced: A mix of safe and creative responses.):

What are some effective ways to manage stress in daily life?

1. Stay on top of your body's natural stress levels

When you're stressed, your body's natural stress levels are low.

If you're stressed, your body's natural stress levels are high.

If you're stressed, your body's natural stress levels are low.

2. Avoid excessive exercise

Exercise can make you feel better.

Exercise can make you feel better.

3. Get up early to avoid fatigue

Exercise can make you feel better.

Exercise can make you feel better.

4. Avoid the temptation to take the wrong thing

Exercise can make you feel better.

Exercise can make you feel better.

5. Avoid eating the wrong foods

Exercise can make you feel better.

Exercise can make you feel better.

6. Avoid the temptation to

Top-p 0.7 (Creative: Generates more diverse and creative responses.):

What are some effective ways to manage stress in daily life?

I am talking about a very specific situation. The person I am talking about has been stressed, but has not been doing much work for a long time. I want to tell you, because this person has had a lot of stress in his life, that it is not something you can just go back to. But what I'm trying to say is, that if you don't have a job, you have to go back to work every day, so you can spend more time with your family. So I've been doing that for a long time now. And so, that is a very common occurrence.

But what do you think is the best way to deal with the stress?

You know, it's not easy to deal with it. It is very difficult to deal with the stress that we experience. So, that is a very good way to deal with it. So, I think it's the

Top-p 0.9 (Very creative: Allows for highly diverse and less probable responses.):

What are some effective ways to manage stress in daily life?

There are many things that can be done by daily meditation and practice. As a general rule of thumb, meditation can help you stay mindful of your own needs, goals, feelings, desires, emotions, and the life and emotions around you.

The purpose of meditation is to feel a deep desire to practice more, to be more mindful, and to be more productive. It also serves to enhance your overall well-being.

1. Be active, be creative, be mindful, and be optimistic.

This is where the first step towards meditation comes from. If we're looking for inspiration, there's a whole section on being "active" and "creative."

While I'm not sure I know much about meditation, I know some of its practitioners and some that I never met. My mom used to tell me that she'd always find a way to make her feel more connected and involved.

Now let's understand the output.

Top-P 0.1 – Very conservative: The model samples only from the most likely next words that together account for 10% of the probability mass, which often means a single word. As a result, the generated content is highly repetitive, lacks diversity, and in most cases carries little information.
Top-P 0.3 – Conservative: The model samples from the most likely next words that together account for 30% of the probability mass, so it is slightly less conservative than the previous setting. As the output shows, this did not improve the generated content: the prompt was repeated throughout the completion. The repetition means that, for the model, the most probable continuation of the prompt was the prompt itself.
Top-P 0.5 – Balanced: Here the model lists numbered strategies for the first time. There is still some repetition at this setting, but the model has started to bring in a wider range of words. The output is a mix of standard advice and some inconsistencies. This Top-P value allows more creativity but still struggles with depth of information.
Top-P 0.7 – Creative: The model can now choose from a wider range of words, and the response shifts toward a narrative style. The content is more creative because it involves a scenario in which a person is dealing with stress. The drawback is a loss of focus: the emphasis is not on managing stress but on the difficulty of coping with it.
Top-P 0.9 – Very creative: In this mode the model has access to a wide range of words and ideas, including less probable words and concepts. This setting let the model use more expressive language. Again, the drawback of being very creative is that the model drifts away from the prompt in pursuit of rich and varied content.
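The repetition described above can also be put into a rough number. The sketch below is not part of the original experiment: it computes a simple "distinct n-gram" ratio (unique word n-grams divided by total n-grams) on two short excerpts copied from the outputs. The helper name `distinct_n` is hypothetical, and whitespace word splitting is a simplifying assumption, since the model itself works on tokens.

```python
def distinct_n(text, n=2):
    """Fraction of word n-grams in `text` that are unique.

    1.0 means no n-gram repeats; values near 0 mean heavy repetition.
    Returns 0.0 when the text is too short to contain any n-gram.
    """
    words = text.lower().split()
    ngrams = [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]
    if not ngrams:
        return 0.0
    return len(set(ngrams)) / len(ngrams)


# Excerpt from the Top-p 0.1 output above.
low_top_p_excerpt = (
    "The most common way to avoid stress is to avoid it. "
    "The most common way to avoid stress is to avoid it. "
    "The most common way to avoid stress is to avoid it."
)

# Excerpt from the Top-p 0.9 output above.
high_top_p_excerpt = (
    "There are many things that can be done by daily meditation and practice. "
    "As a general rule of thumb, meditation can help you stay mindful of your "
    "own needs, goals, feelings, desires, emotions, and the life and emotions "
    "around you."
)

print("top_p=0.1 distinct-2:", round(distinct_n(low_top_p_excerpt), 3))
print("top_p=0.9 distinct-2:", round(distinct_n(high_top_p_excerpt), 3))
```

The looping excerpt scores lower than the varied one, which matches the qualitative reading of the outputs. A ratio like this only measures surface repetition; it says nothing about whether the text stays on topic.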

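The important takeaway from the above experiment is that it helps us understand how the content changes as the Top-P setting changes, and whether this parameter should be used on its own.

Now let's look at a couple more use cases to see the influence of Top-P. As in the previous part of the series, they are "Creative Story Generation" and "Technical Explanation."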

Python

import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

# Load GPT-2 model and tokenizer
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

# Add pad token to tokenizer (GPT-2 doesn't have it by default)
tokenizer.pad_token = tokenizer.eos_token

# Function to generate response based on top_p
def generate_with_top_p(prompt, top_p, max_length=250):
    inputs = tokenizer(prompt, return_tensors='pt')
    outputs = model.generate(
        inputs.input_ids,
        attention_mask=inputs.attention_mask,
        do_sample=True,
        max_length=max_length,
        top_p=top_p,
        pad_token_id=tokenizer.eos_token_id,
        eos_token_id=tokenizer.eos_token_id,
        no_repeat_ngram_size=2  # Prevents repetition of phrases
    )
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

### USE CASE 1: CREATIVE STORY GENERATION ###

def creative_story_generation():
    prompt = ("In the mystical land of Eldoria, a young warrior found an ancient map "
              "that led to a hidden treasure guarded by dragons. He knew that courage and "
              "wisdom would be his allies on this perilous journey.")

    # Negative Impact: Low top_p for creative writing (less creative)
    print("\n=== Creative Story with Low top_p (0.2) - Negative Impact: ===")
    low_top_p_story = generate_with_top_p(prompt, top_p=0.2)
    print(low_top_p_story)

    # Perfect Impact: High top_p for creative writing (more creative)
    print("\n=== Creative Story with High top_p (0.95) - Perfect Impact: ===")
    high_top_p_story = generate_with_top_p(prompt, top_p=0.95)
    print(high_top_p_story)

### USE CASE 2: TECHNICAL EXPLANATION ###

def technical_explanation():
    prompt = ("Explain step by step how the internet works, focusing on how computers "
              "use IP addresses and data packets to communicate with each other.")

    # Negative Impact: High top_p for technical writing (less precise)
    print("\n=== Technical Explanation with High top_p (0.95) - Negative Impact: ===")
    high_top_p_explanation = generate_with_top_p(prompt, top_p=0.95)
    print(high_top_p_explanation)

    # Perfect Impact: Optimal top_p for technical writing (accurate)
    print("\n=== Technical Explanation with Optimal top_p (0.5) - Perfect Impact: ===")
    optimal_top_p_explanation = generate_with_top_p(prompt, top_p=0.5)
    print(optimal_top_p_explanation)

# Run both use cases
creative_story_generation()
technical_explanation()

Output:

PowerShell

python top_p_multiple.py

=== Creative Story with Low top_p (0.2) - Negative Impact: ===
In the mystical land of Eldoria, a young warrior found an ancient map that led to a hidden treasure guarded by dragons. He knew that courage and wisdom would be his allies on this perilous journey.

The Dragon King
...
 (The Book of the Dragon)
,
-
: The Dragon Lord is a legendary warrior who has been the focus of many legends. The dragon king is the most powerful of all the dragons in the world. In the magical land, he is known as the "Dragon King". He is also known to be the leader of a group of dragons called the Black Dragons. His name is derived from the dragon's name, "the dragon".
"The Black Dragon" is an important symbol of power and powerlessness. It is said that the black dragon is able to create a dragon that can defeat the strongest of his enemies. However, the true power of this dragon lies in his ability to manipulate the minds of others. This ability is called "The Dark Dragon". The Dark dragon has a powerful sense of self-preservation and is capable of manipulating others to his will. When he has control over others, his power is so great that he can destroy entire cities. As a result

=== Creative Story with High top_p (0.95) - Perfect Impact: ===
In the mystical land of Eldoria, a young warrior found an ancient map that led to a hidden treasure guarded by dragons. He knew that courage and wisdom would be his allies on this perilous journey.

Spirits are like gods. In this world, there are no gods without secrets. There are also no secrets about being a fighter or a thief. But every dragon has a special hidden skill, and he or she can use that skill to destroy and gain strength or hide something hidden in the secret. Many dragons are skilled at their martial arts, while most are unaware of the secrets of their true power. These dragons cannot only use these skills, but that will only allow them to escape the dragons' clutches. Because their training will be tested before they're even born, dragon fighting has never been so hard, even without training, so they should be able to break a dragon's body.

=== Technical Explanation with High top_p (0.95) - Negative Impact: ===
Explain step by step how the internet works, focusing on how computers use IP addresses and data packets to communicate with each other. If a person with the same identity as a user on the US government's private network uses the online address bar, then this data is sent to a server on a computer on your local network. Your IP address is a small byte in the string. The IP and network address are identical. Do you remember, you just want to do that instead of using IPs or numbers. In addition, remember that IP can be used to verify a particular IP for you and your computer. For instance, your name does not always match an address on our government network and you should have your public IP in this country. This does seem quite unusual and perhaps a bit bizarre.

There was a time in Silicon Valley when you could set your identity out. But in most of today's world, how do you set up your own address and how does one look for it? What about the public? The internet itself was different. It was just a set of rules around data flow that you were supposed to follow. Now, even in today the "internet in general" seems a little more complicated to define. Let's say

=== Technical Explanation with Optimal top_p (0.5) - Perfect Impact: ===
Explain step by step how the internet works, focusing on how computers use IP addresses and data packets to communicate with each other.

"We've been trying to understand how it works and what it means for the future," says James. "It's not just about the IP address, it's about how people communicate. It's also about what's going on with the data. We want to see how this works. What is the Internet going to look like in the next 10 years?"
, the director of the Computer Science and Artificial Intelligence Laboratory at the University of Michigan, says that while there's still a lot of work to be done, "we've got to start to think about it."

Now let's analyze and break down the outputs for creative story generation and technical explanation to see how the Top-P setting affected them.

We have tried to present the outputs in a way that makes the impact of Top-P easy to see.

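Creative Story Generation

Low Top-P (negative impact): With a low Top-P, the model restricts the words and phrases it uses, which leads to repetition and redundancy. Creativity is limited, and the model does not introduce new ideas. Note, however, that the logical flow is still maintained and the model stays on topic, which is a useful property of low Top-P values.
High Top-P (perfect impact): Here the model introduces new concepts and adds creative angles to the storytelling. It uses more evocative language, adding depth and richness to the text. However, the logical flow suffers.

Comparing the two stories makes the impact of Top-P clear and shows how it can affect creative writing.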
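One reason the low Top-P story does not loop on a single sentence the way the Top-p 0.1 output did in the first experiment is that the second script also passes `no_repeat_ngram_size=2` to `generate`. The following is a simplified sketch of the idea behind that setting, not the library's implementation: it works on plain words instead of token IDs, and the helper name `banned_next_tokens` is hypothetical. Given the text so far, it lists the next words that would complete an n-gram that has already appeared.

```python
def banned_next_tokens(generated, ngram_size):
    """Return the set of next tokens that would repeat an n-gram
    already present in `generated` (a list of tokens)."""
    if ngram_size < 1 or len(generated) < ngram_size - 1:
        return set()

    # The last (n - 1) tokens are the prefix of the n-gram about to be formed.
    prefix = tuple(generated[len(generated) - (ngram_size - 1):])

    banned = set()
    for i in range(len(generated) - ngram_size + 1):
        ngram = tuple(generated[i:i + ngram_size])
        # If an earlier n-gram starts with the same prefix, its final
        # token must not be generated again at this position.
        if ngram[:-1] == prefix:
            banned.add(ngram[-1])
    return banned


# With bigram blocking, "to avoid" has already occurred, so after the
# second "to" the word "avoid" is no longer allowed.
history = "to avoid stress is to".split()
print(banned_next_tokens(history, ngram_size=2))
```

Blocking repeated n-grams removes exact loops, but it forces the model onto less probable words whenever its preferred continuation is banned, which may be one reason the low Top-P story above contains odd fragments.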

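Technical Explanation

High Top-P (negative impact): As you can see, a high Top-P has a negative effect on the technical explanation. It disrupts the logical flow and pulls the text off topic, and the model introduces information that is irrelevant to the explanation.
Optimal Top-P (perfect impact): With an optimal Top-P, the explanation is more coherent and stays closer to the topic. The content is better aligned with the prompt, balancing accuracy and expression. The information is more reliable because the model is restricted to more probable words.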

Conclusion

With this experiment, we have successfully showcased the importance of the Top-P parameter in controlling the randomness and creativity of the generated text. We first looked at how the output for a single prompt varies with Top-P, and then took a use-case-based approach to see how Top-P steers the output.

However, from this part and the previous one, we have found that no single parameter on its own is enough to deliver satisfactory output quality. That is why it is important to look at the combined impact of all these parameters, which we will do in the final part of this series.

Source:
https://dzone.com/articles/decoding-llm-parameters-top-p
