Decoding LLM Parameters, Part 1: Temperature

LLM Parameters

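Like any machine learning model, large language models have several parameters that control the variance of the generated text output. We have started a multi-part series that explains the impact of these parameters in detail, with the goal of helping you strike the right balance in content generation.

In this first part, we cover the best-known parameter, "Temperature."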

Temperature

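Temperature is the parameter to reach for when you want to control the randomness of predictions. Lower values make the output more deterministic, while higher values allow more diverse results and make the output more creative.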
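To make the effect concrete, here is a minimal sketch of temperature scaling: the model's raw scores (logits) are divided by the temperature before the softmax turns them into probabilities. The function name `softmax_with_temperature` and the three example logits are assumptions chosen for illustration; they are not taken from GPT-2.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities after dividing them by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [score / temperature for score in logits]
    peak = max(scaled)  # subtract the max before exponentiating for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three candidate next tokens (illustrative only).
example_logits = [2.0, 1.0, 0.1]

for t in (0.3, 1.0, 2.0):
    probs = softmax_with_temperature(example_logits, t)
    print(f"temperature={t}: {[round(p, 3) for p in probs]}")
```

A temperature below 1 sharpens the distribution toward the highest-scoring token, and a temperature above 1 flattens it, so less likely tokens get picked more often.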

Let's look at temperature in action through the following code and output. To keep the demonstration simple, we use Hugging Face transformers and, in particular, the GPT-2 model.

Python

import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

# Load GPT-2 model and tokenizer
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

# Add pad token to tokenizer (GPT-2 doesn't have it by default)
tokenizer.pad_token = tokenizer.eos_token

# Function to generate response with varying temperature
def generate_with_temperature(prompt, temperature):
    inputs = tokenizer(prompt, return_tensors='pt', padding=True)

    # Set the attention_mask and pad_token_id
    outputs = model.generate(
        inputs.input_ids,
        attention_mask=inputs['attention_mask'],
        do_sample=True,  # sample from the distribution so temperature takes effect
        max_length=200,
        temperature=temperature,
        pad_token_id=tokenizer.eos_token_id
    )
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

# Prompt on which we get the content generated based on temperature
prompt = "What are some of the best strategies within NFL Fantasy"

# Test temperature variations
print("Temperature 0.3 (More deterministic but coherent):\n", generate_with_temperature(prompt, temperature=0.3))
print("\nTemperature 0.5 (Balanced):\n", generate_with_temperature(prompt, temperature=0.5))
print("\nTemperature 0.9 (More creative):\n", generate_with_temperature(prompt, temperature=0.9))

Output:

PowerShell

python test_temperature.py
Temperature 0.3 (More deterministic but coherent):
 What are some of the best strategies within NFL Fantasy Football?
​
I've seen a lot of great strategies for predicting the next NFL season. I've seen great strategies for predicting the next NFL season.
​
I've seen a lot of great strategies for predicting the next NFL season. I've seen great strategies for predicting the next NFL season.
​
I've seen a lot of great strategies for predicting the next NFL season. I've seen great strategies for predicting the next NFL season.
​
I've seen a lot of great strategies for predicting the next NFL season. I've seen great strategies for predicting the next NFL season.
​
I've seen a lot of great strategies for predicting the next NFL season. I've seen great strategies for predicting the next NFL season.
​
I've seen a lot of great strategies for predicting the next NFL season. I've seen great strategies for predicting the next NFL season.
​
I've seen a lot of great strategies for predicting the next
​
Temperature 0.5 (Balanced):
 What are some of the best strategies within NFL Fantasy Football?
​
1. Pick up the first pick and use it to draft a QB
​
In the past, I've mentioned that I tend to pick up the first pick in the first round. The best way to pick up the first pick is to take the first pick and then use it to draft a QB. I find that the more I draft the QB, the more I draft the QB. This is why I like to draft the first pick in the first round.
​
2. Draft a QB for the first time
​
This is an important one, but it's not the only one. Drafting a QB for the first time is a great way to start your fantasy season.
​
3. Draft a QB for the first time
​
This is the most important one. Drafting a QB for the first time is the most important one. Drafting a QB for the first time is the most important one.
​
Temperature 0.9 (More creative):
 What are some of the best strategies within NFL Fantasy?
​
If you are looking for players that will be good for you, here is an updated list of key stats, which you can find on our official website:
​
All players were ranked in the top 10 fantasy players. These players are all high-rated defensive backs or running backs with good play across all phases of their careers. The players above were ranked from 5-5 for total points scored.
​
The chart below will allow you to visualize the players in your league.
​
All players have 5.5 sacks, 5 sacks and 2.5 tackles for loss on the season. They have a combined 11.3 sacks with a 4.6, 1.6 and 2.1 yards per carry average, respectively.
​
Each player has three touchdowns. The three touchdowns are tied for the top five fantasy points with 3 points in an entire game. The three touchdowns are tied for the top ten points with 2 points

Let's make sense of the output:

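Low Temperature (0.3): The model focuses on the most likely word choices. Use this range when precision and consistency matter. Be aware, though, that the model can get stuck repeating similar phrases, as it does in our output.
Medium Temperature (0.5): This temperature strikes a balance between coherence and creativity. It is a good middle ground if you want some variety without losing structure. The output shows a bit more balance, but some repetition is still visible.
High Temperature (0.9): This temperature pushes the LLM to be as creative as possible. The output differs from the previous two and brings a lot of randomness and variation into the content.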
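The pattern in the three settings above can be reproduced on a toy scale without loading a model. The sketch below samples from a small, made-up next-token distribution at the same three temperatures and reports how often the single most likely token wins. The vocabulary, the logits, and the helper name `sample_tokens` are all assumptions for illustration.

```python
import math
import random

def sample_tokens(logits, temperature, n_samples, seed=0):
    """Draw token indices from the temperature-scaled distribution."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same draws
    scaled = [score / temperature for score in logits]
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]  # unnormalized softmax weights
    return rng.choices(range(len(logits)), weights=weights, k=n_samples)

# Hypothetical vocabulary and scores (not produced by GPT-2).
toy_vocab = ["the", "a", "draft", "season", "quarterback"]
toy_logits = [3.0, 2.0, 1.0, 0.5, 0.0]

for t in (0.3, 0.5, 0.9):
    draws = sample_tokens(toy_logits, t, n_samples=1000)
    top_share = draws.count(0) / len(draws)
    print(f"temperature={t}: '{toy_vocab[0]}' chosen {top_share:.0%} of the time, "
          f"{len(set(draws))} distinct tokens seen")
```

When one token wins almost every draw, a small model such as GPT-2 tends to fall into loops, which is consistent with the repeated sentences in the 0.3 output.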

The previous example gives a basic understanding of temperature. Now let's look more closely at two use cases: "Creative Story Generation" and "Technical Explanation."

The following code shows how temperature affects each of these two use cases.

Python

import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

# Load GPT-2 model and tokenizer
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

# Add pad token to tokenizer (GPT-2 doesn't have it by default)
tokenizer.pad_token = tokenizer.eos_token

# Function to generate response based on temperature
def generate_with_temperature(prompt, temperature, max_length=200):
    inputs = tokenizer(prompt, return_tensors='pt', padding=True)
    outputs = model.generate(
        inputs.input_ids,
        attention_mask=inputs['attention_mask'],
        do_sample=True,
        max_length=max_length,
        temperature=temperature,  # Only focusing on temperature
        pad_token_id=tokenizer.eos_token_id
    )
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

### USE CASE 1: CREATIVE STORY GENERATION ###

def creative_story_generation():
    prompt = "Once upon a time, in a distant galaxy, there was a spaceship called Voyager."

    # Negative Impact: Low temperature for creative writing (too deterministic, repetitive)
    print("\n=== Creative Story with Low Temperature (0.2) - Negative Impact: ===")
    low_temp_story = generate_with_temperature(prompt, temperature=0.2)
    print(low_temp_story)

    # Perfect Impact: High temperature for creative writing (more creative and varied)
    print("\n=== Creative Story with High Temperature (0.9) - Perfect Impact: ===")
    high_temp_story = generate_with_temperature(prompt, temperature=0.9)
    print(high_temp_story)

### USE CASE 2: TECHNICAL EXPLANATION ###

def technical_explanation():
    prompt = "Explain how blockchain works in simple terms."

    # Negative Impact: High temperature for technical writing (too creative, inaccurate)
    print("\n=== Technical Explanation with High Temperature (0.9) - Negative Impact: ===")
    high_temp_explanation = generate_with_temperature(prompt, temperature=0.9)
    print(high_temp_explanation)

    # Perfect Impact: Optimal temperature for technical writing (accurate and focused)
    print("\n=== Technical Explanation with Adjusted Temperature (0.7) - Perfect Impact: ===")
    perfect_temp_explanation = generate_with_temperature(prompt, temperature=0.7)
    print(perfect_temp_explanation)

# Run both use cases
creative_story_generation()
technical_explanation()

Output:

PowerShell

python temperature_impact.py
​
=== Creative Story with Low Temperature (0.2) - Negative Impact: ===
Once upon a time, in a distant galaxy, there was a spaceship called Voyager. It was a spaceship that had been brought back from the dead by the gods. It was a spaceship that had been brought back from the dead by the gods. It was a spaceship that had been brought back from the dead by the gods. It was a spaceship that had been brought back from the dead by the gods. It was a spaceship that had been brought back from the dead by the gods. It was a spaceship that had been brought back from the dead by the gods. It was a spaceship that had been brought back from the dead by the gods. It was a spaceship that had been brought back from the dead by the gods. It was a spaceship that had been brought back from the dead by the gods. It was a spaceship that had been brought back from the dead by the gods. It was a spaceship that had been brought back from the dead by the gods. It was a spaceship that had been
​
=== Creative Story with High Temperature (0.9) - Perfect Impact: ===
Once upon a time, in a distant galaxy, there was a spaceship called Voyager. The ship seemed to have been flying in space as well, like the spaceship in the movie "The Voyage Home". The Captain of the Voyager was Captain Jean-Luc Picard.
​
In the Star Trek movies, this is true. But in the movie, on Voyager, our hero is not Jean-Luc Picard. Jean-Luc, the villain, has no desire to lead us to vengeance against the dying star.
​
But what about Star Trek VI: The Undiscovered Country…
​
In a scene that's been in development for years (one you'll almost certainly see in an upcoming Star Trek film), in the middle of the movie (one you won't see until later in the film), we see that Picard is no longer in the Star Trek universe as is the story, to be played by the same actor who played Lieutenant Dax (who was in the movie), but Picard himself.
​
=== Technical Explanation with High Temperature (0.9) - Negative Impact: ===
Explain how blockchain works in simple terms.
​
Blockchain can be used to determine if the system is trustworthy and to prevent fraud, even if the system is used in a completely different manner. Blockchain can also be used to help determine how the system is run and to ensure that its operation is efficient.
​
This way all your trust in the system can be verified by your actions, and you can have full control over it. When you are not trusting a computer, it can be easy to get a hold of a server and then just change the software, allowing you to control and monitor transactions with the help of the blockchain. If your business uses distributed storage then it is easy to have more control over your activities.
​
What do I need to understand about blockchain?
​
To understand how blockchain works and how you can use it properly, you must first understand how blockchain works.
​
Bitcoins are digital tokens, created at the start of each generation that are used to
​
=== Technical Explanation with Adjusted Temperature (0.7) - Perfect Impact: ===
Explain how blockchain works in simple terms.
​
What are the key differences between Bitcoin and Ethereum?
​
Blockchain is a cryptographic protocol. It can be used to create any type of transaction. It is used to store data and create new entities. It is used as a system of communication in blockchain systems.
​
In Ethereum, the transaction is recorded, stored, and used to perform the transaction. It is a way to transfer information. The transaction is called a "blockchain."
​
Since the blockchain is used for many things, it is easy to understand how the technology works. The most important difference is that Ethereum uses the blockchain to create an interface to the Internet of Things. It is this interface that allows for data exchange and the creation of new entities.
​
Because of this, it is possible to perform the transactions on the blockchain. So, what is the difference between Bitcoin and Ethereum?
​
The Bitcoin and Ethereum blockchain is a distributed ledger.

Now let's analyze the output to see how the temperature setting affected creative story generation and technical explanation, and observe how a setting that works well for one use case has the opposite effect on the other.

Creative Story Generation

Low Temperature (Negative Impact): As you can see, the story output is highly repetitive and lacks variety. This result is not satisfying for a creative task, and the extreme repetitiveness caused by the model’s inability to introduce novel and innovative ideas makes it undesirable for storytelling.
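High Temperature (Perfect Impact): As you can see from the output, the story moves in interesting directions and is very creative. The output adds multiple facets to the story, which makes it varied, imaginative, and well suited to innovative storytelling.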
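The difference in repetitiveness between the two stories can also be put into a number. One simple option, sketched below, is the share of word trigrams that are unique in a text. The helper name `distinct_ngram_ratio` is an assumption, and the two sample strings are shortened excerpts from the outputs above rather than the full generations.

```python
def distinct_ngram_ratio(text, n=3):
    """Share of word n-grams that are unique; 1.0 means no n-gram repeats."""
    words = text.lower().split()
    ngrams = [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]
    if not ngrams:
        return 1.0  # too short to contain any n-gram, so nothing repeats
    return len(set(ngrams)) / len(ngrams)

# Shortened excerpts from the low- and high-temperature stories.
looping = "It was a spaceship that had been brought back from the dead by the gods. " * 5
varied = ("The ship seemed to have been flying in space as well, "
          "like the spaceship in the movie.")

print(f"looping text: {distinct_ngram_ratio(looping):.2f}")
print(f"varied text:  {distinct_ngram_ratio(varied):.2f}")
```

A ratio close to 1.0 indicates little repetition, while a low ratio flags the kind of looping seen in the low-temperature story. The metric says nothing about whether the text is accurate or sensible.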

Technical Explanation

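High Temperature (Negative Impact): Maintaining factual accuracy is very important for use cases such as technical explanations. High temperature often introduces a lot of randomness and less probable words into the generated content, which can make it unsatisfying for technical writing. The output above shows this: it is too vague and includes irrelevant ideas.
Adjusted Temperature (Perfect Impact): We adjusted the temperature to a setting that suits technical content generation. As you can see, the output is much better organized now. At this setting, the model avoids the repetition seen at lower temperatures and does not lose coherence the way it does at higher temperatures.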
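One way to keep these choices in one place is a small preset table keyed by use case. The sketch below is only an illustration: the names `TEMPERATURE_PRESETS` and `generation_kwargs` are hypothetical, the two preset values are the ones used in this article's experiments with GPT-2, and the 0.5 fallback is an assumption borrowed from the "Balanced" setting in the first example.

```python
# Temperatures that worked in this article's GPT-2 experiments, keyed by use case.
TEMPERATURE_PRESETS = {
    "creative_story": 0.9,
    "technical_explanation": 0.7,
}

# Assumed fallback for use cases that are not listed above.
DEFAULT_TEMPERATURE = 0.5

def generation_kwargs(use_case, max_length=200):
    """Build keyword arguments for model.generate() for the given use case."""
    temperature = TEMPERATURE_PRESETS.get(use_case, DEFAULT_TEMPERATURE)
    return {
        "do_sample": True,  # temperature only matters when sampling is enabled
        "temperature": temperature,
        "max_length": max_length,
    }

for use_case in ("creative_story", "technical_explanation", "unknown_task"):
    print(use_case, generation_kwargs(use_case))
```

The resulting dictionary could be unpacked into the `model.generate(...)` call used earlier. Treat the values as starting points to tune for your own model and prompts, not as fixed recommendations.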

Conclusion

You have now seen the ways temperature can affect content generation and which temperature setting suits which use case. Also note that adjusting temperature is not the be-all and end-all of generation; other parameters need tuning as well. We will look at all of them in the upcoming posts in this series.

Source:
https://dzone.com/articles/decoding-llm-parameters-temperature