解码LLM参数，第二部分：Top-P（核采样）

教程

模型参数

就像任何机器学习模型一样，大型语言模型也有各种参数来控制生成的文本输出的方差。我们已经开始了一个多部分系列，详细解释了这些参数的影响。我们将通过平衡我们多部分系列中讨论的所有这些参数，来完美地生成内容。

欢迎来到第二部分，我们将讨论另一个著名的参数，“Top-P”。

Top-P（核采样）

如果目标是控制模型输出的多样性，那么Top-P就是你的选择。较低的Top-P迫使模型使用最可能的单词，而较高的Top-P则迫使模型使用更多样化的单词，从而提高创造力。

让我们通过以下代码和输出来看看Top-P是如何工作的。

Python

import torch

from transformers import GPT2LMHeadModel, GPT2Tokenizer

​

# Load GPT-2 model and tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")

model = GPT2LMHeadModel.from_pretrained("gpt2")

​

# Add pad token to tokenizer (GPT-2 doesn't have it by default)

tokenizer.pad_token = tokenizer.eos_token

​

# Function to generate response with varying top_p

def generate_with_top_p(prompt, top_p):

    inputs = tokenizer(prompt, return_tensors='pt', padding=True)

​

    # Set the attention_mask and pad_token_id

    outputs = model.generate(

        inputs.input_ids,

        attention_mask=inputs['attention_mask'],

        do_sample=True,

        max_length=200, 

        top_p=top_p,

        pad_token_id=tokenizer.eos_token_id

    )

    return tokenizer.decode(outputs[0], skip_special_tokens=True)

​

​

prompt = "What are some effective ways to manage stress in daily life?"

​

# List of top-p values and their descriptions

top_p_values = {

    0.1: "Very conservative: Generates highly probable and safe responses.",

    0.3: "Conservative: Generates probable responses with less risk.",

    0.5: "Balanced: A mix of safe and creative responses.",

    0.7: "Creative: Generates more diverse and creative responses.",

    0.9: "Very creative: Allows for highly diverse and less probable responses."

}

​

# Test top_p variations

for top_p, description in top_p_values.items():

    print(f"\nTop-p {top_p} ({description}):\n")

    print(generate_with_top_p(prompt, top_p=top_p))

​

输出：

PowerShell

python test_top_p.py

​

Top-p 0.1 (Very conservative: Generates highly probable and safe responses.):

​

What are some effective ways to manage stress in daily life?

​

The following are some of the most common ways to manage stress in daily life.

​

1. Avoiding stress

​

The most common way to avoid stress is to avoid it.

​

The most common way to avoid stress is to avoid it.

​

2. Avoiding stress

​

The most common way to avoid stress is to avoid it.

​

The most common way to avoid stress is to avoid it.

​

3. Avoiding stress

​

The most common way to avoid stress is to avoid it.

​

The most common way to avoid stress is to avoid it.

​

4. Avoiding stress

​

The most common way to avoid stress is to avoid it.

​

The most common way to avoid stress is to avoid it.

​

5. Avoiding stress

​

The most common way to avoid stress is to avoid it.

​

The most common way to avoid stress

​

Top-p 0.3 (Conservative: Generates probable responses with less risk.):

​

What are some effective ways to manage stress in daily life?

​

What are some effective ways to manage stress in daily life?

​

What are some effective ways to manage stress in daily life?

​

What are some effective ways to manage stress in daily life?

​

What are some effective ways to manage stress in daily life?

​

What are some effective ways to manage stress in daily life?

​

What are some effective ways to manage stress in daily life?

​

What are some effective ways to manage stress in daily life?

​

What are some effective ways to manage stress in daily life?

​

What are some effective ways to manage stress in daily life?

​

What are some effective ways to manage stress in daily life?

​
What are some effective ways to manage stress in daily life?
​
What are some effective ways to manage stress in daily life?
​
What are some effective ways to manage stress in daily life?
​
What are some effective
​
Top-p 0.5 (Balanced: A mix of safe and creative responses.):
​
What are some effective ways to manage stress in daily life?
​
1. Stay on top of your body's natural stress levels
​
When you're stressed, your body's natural stress levels are low.
​
If you're stressed, your body's natural stress levels are high.
​
If you're stressed, your body's natural stress levels are low.
​
2. Avoid excessive exercise
​
Exercise can make you feel better.
​
Exercise can make you feel better.
​
3. Get up early to avoid fatigue
​
Exercise can make you feel better.
​
Exercise can make you feel better.
​
4. Avoid the temptation to take the wrong thing
​
Exercise can make you feel better.
​
Exercise can make you feel better.
​
5. Avoid eating the wrong foods
​
Exercise can make you feel better.
​
Exercise can make you feel better.
​
6. Avoid the temptation to
​
Top-p 0.7 (Creative: Generates more diverse and creative responses.):
​
What are some effective ways to manage stress in daily life?
​
I am talking about a very specific situation. The person I am talking about has been stressed, but has not been doing much work for a long time. I want to tell you, because this person has had a lot of stress in his life, that it is not something you can just go back to. But what I'm trying to say is, that if you don't have a job, you have to go back to work every day, so you can spend more time with your family. So I've been doing that for a long time now. And so, that is a very common occurrence.
​
But what do you think is the best way to deal with the stress?
​
You know, it's not easy to deal with it. It is very difficult to deal with the stress that we experience. So, that is a very good way to deal with it. So, I think it's the
​
Top-p 0.9 (Very creative: Allows for highly diverse and less probable responses.):
​
What are some effective ways to manage stress in daily life?
​
There are many things that can be done by daily meditation and practice. As a general rule of thumb, meditation can help you stay mindful of your own needs, goals, feelings, desires, emotions, and the life and emotions around you.
​
The purpose of meditation is to feel a deep desire to practice more, to be more mindful, and to be more productive. It also serves to enhance your overall well-being.
​
1. Be active, be creative, be mindful, and be optimistic.
​
This is where the first step towards meditation comes from. If we're looking for inspiration, there's a whole section on being "active" and "creative."
​
While I'm not sure I know much about meditation, I know some of its practitioners and some that I never met. My mom used to tell me that she'd always find a way to make her feel more connected and involved.

现在让我们来理解这个输出。

Top-P 0.1 – 非常保守：由于模型从可能的后续单词选择的前10%中进行选择，因此生成的内容中有大量的重复。因此，这个回应缺乏多样性，而且大多数时候也不够信息丰富。
0.3 的 Top-P 保守设置：该模型从可能的后续词汇中选择前30%，因此它比之前的 Top-P 设置略微不太保守。从输出结果可以看出，这并没有改善内容生成，并且提示在完成过程中被重复使用。在这种情况下，提示的重复意味着模型认为提示本身是最可能的后续内容。
0.5 的 Top-P 平衡设置：在这里，您首次看到模型列出了一些编号的策略。在这个设置中，您仍然会看到一些重复。但关键是，在这个 Top-P 设置下，模型开始融入更多种类的词汇。输出结果是标准建议与一些不一致性的混合。这个 Top-P 值允许提高创造性，但仍然难以提供深入的信息。
0.7 的 Top-P 创造设置：在这种情况下，模型可以筛选出更多种类的词汇，正如您所看到的，回应正朝着叙事风格转变。由于情景设定中涉及到一个人如何应对压力，因此内容更具创造性。不过，缺点是失去了焦点，重点不是管理压力，而是应对压力的困难。
0.9 的 Top-P 非常创造设置：在这个设置中，模型可以访问大量的词汇和想法，包括不太可能的词汇和概念。这个设置让模型能够使用更具表现力的语言。然而，非常创造的缺点是，模型在追求产生丰富多样的内容时，可能会偏离提示。

上述练习的关键点是，随着Top-P设置的变化，内容如何发生变化。它还让我们了解到，这个参数并不是唯一需要处理内容变化和相关性的参数。

现在，让我们看看Top-P对“创意故事生成”和“技术解释”这两个用例的影响，正如本系列文章的前一部分所讲。

Python

import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer
​
# Load GPT-2 model and tokenizer
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
​
# Add pad token to tokenizer (GPT-2 doesn't have it by default)
tokenizer.pad_token = tokenizer.eos_token
​
# Function to generate response based on top_p
def generate_with_top_p(prompt, top_p, max_length=250):
    inputs = tokenizer(prompt, return_tensors='pt')
    outputs = model.generate(
        inputs.input_ids,
        attention_mask=inputs.attention_mask,
        do_sample=True,
        max_length=max_length,
        top_p=top_p,
        pad_token_id=tokenizer.eos_token_id,
        eos_token_id=tokenizer.eos_token_id,
        no_repeat_ngram_size=2  # Prevents repetition of phrases
    )
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
​
### USE CASE 1: CREATIVE STORY GENERATION ###
​
def creative_story_generation():
    prompt = ("In the mystical land of Eldoria, a young warrior found an ancient map "
              "that led to a hidden treasure guarded by dragons. He knew that courage and "
              "wisdom would be his allies on this perilous journey.")
​
    # Negative Impact: Low top_p for creative writing (less creative)
    print("\n=== Creative Story with Low top_p (0.2) - Negative Impact: ===")
    low_top_p_story = generate_with_top_p(prompt, top_p=0.2)
    print(low_top_p_story)
​
    # Perfect Impact: High top_p for creative writing (more creative)
    print("\n=== Creative Story with High top_p (0.95) - Perfect Impact: ===")
    high_top_p_story = generate_with_top_p(prompt, top_p=0.95)
    print(high_top_p_story)
​
### USE CASE 2: TECHNICAL EXPLANATION ###
​
def technical_explanation():
    prompt = ("Explain step by step how the internet works, focusing on how computers "
              "use IP addresses and data packets to communicate with each other.")
​
    # Negative Impact: High top_p for technical writing (less precise)
    print("\n=== Technical Explanation with High top_p (0.95) - Negative Impact: ===")
    high_top_p_explanation = generate_with_top_p(prompt, top_p=0.95)
    print(high_top_p_explanation)
​
    # Perfect Impact: Optimal top_p for technical writing (accurate)
    print("\n=== Technical Explanation with Optimal top_p (0.5) - Perfect Impact: ===")
    optimal_top_p_explanation = generate_with_top_p(prompt, top_p=0.5)
    print(optimal_top_p_explanation)
​
# Run both use cases
creative_story_generation()
technical_explanation()
​

输出：

PowerShell

python top_p_multiple.py
​
=== Creative Story with Low top_p (0.2) - Negative Impact: ===
In the mystical land of Eldoria, a young warrior found an ancient map that led to a hidden treasure guarded by dragons. He knew that courage and wisdom would be his allies on this perilous journey.
​
The Dragon King
...
 (The Book of the Dragon)
,
-
: The Dragon Lord is a legendary warrior who has been the focus of many legends. The dragon king is the most powerful of all the dragons in the world. In the magical land, he is known as the "Dragon King". He is also known to be the leader of a group of dragons called the Black Dragons. His name is derived from the dragon's name, "the dragon".
"The Black Dragon" is an important symbol of power and powerlessness. It is said that the black dragon is able to create a dragon that can defeat the strongest of his enemies. However, the true power of this dragon lies in his ability to manipulate the minds of others. This ability is called "The Dark Dragon". The Dark dragon has a powerful sense of self-preservation and is capable of manipulating others to his will. When he has control over others, his power is so great that he can destroy entire cities. As a result
​
=== Creative Story with High top_p (0.95) - Perfect Impact: ===
In the mystical land of Eldoria, a young warrior found an ancient map that led to a hidden treasure guarded by dragons. He knew that courage and wisdom would be his allies on this perilous journey.
​
Spirits are like gods. In this world, there are no gods without secrets. There are also no secrets about being a fighter or a thief. But every dragon has a special hidden skill, and he or she can use that skill to destroy and gain strength or hide something hidden in the secret. Many dragons are skilled at their martial arts, while most are unaware of the secrets of their true power. These dragons cannot only use these skills, but that will only allow them to escape the dragons' clutches. Because their training will be tested before they're even born, dragon fighting has never been so hard, even without training, so they should be able to break a dragon's body.
​
=== Technical Explanation with High top_p (0.95) - Negative Impact: ===
Explain step by step how the internet works, focusing on how computers use IP addresses and data packets to communicate with each other. If a person with the same identity as a user on the US government's private network uses the online address bar, then this data is sent to a server on a computer on your local network. Your IP address is a small byte in the string. The IP and network address are identical. Do you remember, you just want to do that instead of using IPs or numbers. In addition, remember that IP can be used to verify a particular IP for you and your computer. For instance, your name does not always match an address on our government network and you should have your public IP in this country. This does seem quite unusual and perhaps a bit bizarre.
​
There was a time in Silicon Valley when you could set your identity out. But in most of today's world, how do you set up your own address and how does one look for it? What about the public? The internet itself was different. It was just a set of rules around data flow that you were supposed to follow. Now, even in today the "internet in general" seems a little more complicated to define. Let's say
​
=== Technical Explanation with Optimal top_p (0.5) - Perfect Impact: ===
Explain step by step how the internet works, focusing on how computers use IP addresses and data packets to communicate with each other.
​
"We've been trying to understand how it works and what it means for the future," says James. "It's not just about the IP address, it's about how people communicate. It's also about what's going on with the data. We want to see how this works. What is the Internet going to look like in the next 10 years?"
, the director of the Computer Science and Artificial Intelligence Laboratory at the University of Michigan, says that while there's still a lot of work to be done, "we've got to start to think about it."

现在，让我们分解并分析基于Top-P设置的创意故事生成和技术解释的输出，以及输出是如何受到影响的。

为了有效演示Top-P的影响，我们引入了更好的提示，以引导输出，使影响容易观察到。

创意故事生成

低Top-P（负面影响）：正如您所看到的，较低的Top-P使得模型局限于使用单词或短语，从而导致重复和冗余。在这种情况下，创造性也受到限制，因为模型试图不引入新想法。但是，如果您注意，逻辑流程仍然得到维护，模型仍然紧扣主题，这是低Top-P值的特点。
高Top-P（完美影响）：在这种情况下，模型引入了新概念，并为叙述增添了创意角度。使用了更广泛的词汇，使文本更加深入和丰富。然而，由于创造性的提高，逻辑流程受到了抑制。

两个叙述之间的对比清楚地显示了Top-P的影响，使人们容易理解它如何影响创意写作。

技术解释

高Top-P（负面影响）：如您所见，高Top-P对技术解释产生了负面影响，它阻止了逻辑流畅，并使内容偏离了主题。模型还引入了与解释无关的信息，这些信息与解释不相关。
最优Top-P（完美影响）：使用最优Top-P时，解释更加连贯，内容更接近主题。输出与提示更加一致，在准确性和表达之间取得了很好的平衡。由于模型限制在更可能的单词中，因此信息的可靠性得到了增强。

结论

通过这个实验，我们成功地展示了Top-P参数在控制生成文本的随机性和创造性方面的重要性。我们首先研究了一个提示，并观察了Top-P变化时输出的变化，然后采用了基于用例的方法，研究了Top-P如何根据用例控制输出。

然而，从前一部分和本部分来看，我们发现单独来看，每个参数对内容生成的质量来说是不够的。因此，研究所有这些参数的影响至关重要，这将是本系列的最后一部分。

Source:
https://dzone.com/articles/decoding-llm-parameters-top-p