Decoding LLM Parameters, Part 1: Temperature

Like any machine learning model, large language models have several parameters that control the variance of the generated text. We have started a multi-part series to explain the impact of these parameters in detail. By the end of the series, we will work toward a well-balanced configuration for content generation that uses all of the parameters discussed.

Welcome to the first part, where we discuss the best-known parameter: "Temperature".

Temperature

If the goal is to control the randomness of the predictions, temperature is the parameter you are looking for. Lower temperature values make the output more deterministic, while higher values allow more diverse results, which makes the output more creative.
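To make the mechanism concrete, here is a minimal sketch of how temperature is commonly applied: the model's raw scores (logits) are divided by the temperature before the softmax turns them into probabilities. The function name `softmax_with_temperature` and the logit values are hypothetical and chosen only for illustration; they are not taken from GPT-2.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw scores (logits) into probabilities after dividing by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    # Subtract the maximum before exponentiating for numerical stability
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for four candidate next tokens (illustrative values only)
logits = [2.0, 1.0, 0.5, 0.1]

for t in (0.3, 0.5, 0.9):
    probs = softmax_with_temperature(logits, t)
    print(f"temperature={t}:", [round(p, 3) for p in probs])
```

At lower temperatures the top-scoring token takes a larger share of the probability mass, so sampling picks it more often; at higher temperatures the mass spreads across the alternatives.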

Let's see temperature in action using the following code and output. To keep the demonstration simple, we use Hugging Face Transformers and the GPT-2 model in particular.

Python

import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

# Load GPT-2 model and tokenizer
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

# Add pad token to tokenizer (GPT-2 doesn't have it by default)
tokenizer.pad_token = tokenizer.eos_token

# Function to generate response with varying temperature
def generate_with_temperature(prompt, temperature):
    inputs = tokenizer(prompt, return_tensors='pt', padding=True)

    # Set the attention_mask and pad_token_id
    outputs = model.generate(
        inputs.input_ids,
        attention_mask=inputs['attention_mask'],
        do_sample=True,  # sample from the distribution so that temperature takes effect
        max_length=200,  # total length in tokens, prompt included
        temperature=temperature,
        pad_token_id=tokenizer.eos_token_id
    )
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

# Prompt on which we get the content generated based on temperature
prompt = "What are some of the best strategies within NFL Fantasy"

# Test temperature variations
print("Temperature 0.3 (More deterministic but coherent):\n", generate_with_temperature(prompt, temperature=0.3))
print("\nTemperature 0.5 (Balanced):\n", generate_with_temperature(prompt, temperature=0.5))
print("\nTemperature 0.9 (More creative):\n", generate_with_temperature(prompt, temperature=0.9))

Output:

PowerShell

python test_temperature.py
Temperature 0.3 (More deterministic but coherent):
 What are some of the best strategies within NFL Fantasy Football?

I've seen a lot of great strategies for predicting the next NFL season. I've seen great strategies for predicting the next NFL season.

I've seen a lot of great strategies for predicting the next NFL season. I've seen great strategies for predicting the next NFL season.

I've seen a lot of great strategies for predicting the next NFL season. I've seen great strategies for predicting the next NFL season.

I've seen a lot of great strategies for predicting the next NFL season. I've seen great strategies for predicting the next NFL season.

I've seen a lot of great strategies for predicting the next NFL season. I've seen great strategies for predicting the next NFL season.

I've seen a lot of great strategies for predicting the next NFL season. I've seen great strategies for predicting the next NFL season.

I've seen a lot of great strategies for predicting the next

Temperature 0.5 (Balanced):
 What are some of the best strategies within NFL Fantasy Football?

1. Pick up the first pick and use it to draft a QB

In the past, I've mentioned that I tend to pick up the first pick in the first round. The best way to pick up the first pick is to take the first pick and then use it to draft a QB. I find that the more I draft the QB, the more I draft the QB. This is why I like to draft the first pick in the first round.

2. Draft a QB for the first time

This is an important one, but it's not the only one. Drafting a QB for the first time is a great way to start your fantasy season.

3. Draft a QB for the first time

This is the most important one. Drafting a QB for the first time is the most important one. Drafting a QB for the first time is the most important one.

Temperature 0.9 (More creative):
 What are some of the best strategies within NFL Fantasy?

If you are looking for players that will be good for you, here is an updated list of key stats, which you can find on our official website:

All players were ranked in the top 10 fantasy players. These players are all high-rated defensive backs or running backs with good play across all phases of their careers. The players above were ranked from 5-5 for total points scored.

The chart below will allow you to visualize the players in your league.

All players have 5.5 sacks, 5 sacks and 2.5 tackles for loss on the season. They have a combined 11.3 sacks with a 4.6, 1.6 and 2.1 yards per carry average, respectively.

Each player has three touchdowns. The three touchdowns are tied for the top five fantasy points with 3 points in an entire game. The three touchdowns are tied for the top ten points with 2 points

Let's look at the output:

Low Temperature (0.3): The model focuses on the most probable word choices. If precision and consistency matter to you, consider a temperature in this range. Keep in mind, however, that the model can get stuck repeating similar phrases, as it does in our output here.
Medium Temperature (0.5): This temperature balances coherence and creativity. It is a good middle ground if you want a reasonable amount of variation without losing structure. As you can see in the output, some balance has been added, although there is still some repetition.
High Temperature (0.9): This temperature pushes the LLM to be as creative as possible. As you can see, this output differs from the previous two and introduces a lot of randomness and variation into the content.
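One way to put a number on "more random" is the Shannon entropy of the next-token distribution at each temperature. The sketch below is an illustration only: the helper `temperature_entropy` and the toy logits are assumptions made for this example, not values produced by GPT-2.

```python
import math

def temperature_entropy(logits, temperature):
    """Shannon entropy (in bits) of the temperature-scaled softmax distribution."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Toy logits for four candidate tokens (hypothetical values)
toy_logits = [3.0, 2.0, 1.0, 0.0]

entropies = {t: temperature_entropy(toy_logits, t) for t in (0.3, 0.5, 0.9)}
for t, h in entropies.items():
    print(f"temperature={t}: entropy={h:.3f} bits")
```

Entropy grows with temperature: the distribution moves from being concentrated on one token toward being spread over all four, which is consistent with the shift from repetition to variation seen in the three outputs.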

The example above establishes a basic understanding of temperature. Let's look at it in a little more detail with a couple of use cases: "Creative Story Generation" and "Technical Explanation".

The following code shows how temperature affects these two use cases.

Python

import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

# Load GPT-2 model and tokenizer
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

# Add pad token to tokenizer (GPT-2 doesn't have it by default)
tokenizer.pad_token = tokenizer.eos_token

# Function to generate response based on temperature
def generate_with_temperature(prompt, temperature, max_length=200):
    inputs = tokenizer(prompt, return_tensors='pt', padding=True)
    outputs = model.generate(
        inputs.input_ids,
        attention_mask=inputs['attention_mask'],
        do_sample=True,  # sampling must be enabled for temperature to take effect
        max_length=max_length,
        temperature=temperature,  # Only focusing on temperature
        pad_token_id=tokenizer.eos_token_id
    )
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

### USE CASE 1: CREATIVE STORY GENERATION ###

def creative_story_generation():
    prompt = "Once upon a time, in a distant galaxy, there was a spaceship called Voyager."

    # Negative Impact: Low temperature for creative writing (too deterministic, repetitive)
    print("\n=== Creative Story with Low Temperature (0.2) - Negative Impact: ===")
    low_temp_story = generate_with_temperature(prompt, temperature=0.2)
    print(low_temp_story)

    # Perfect Impact: High temperature for creative writing (more creative and varied)
    print("\n=== Creative Story with High Temperature (0.9) - Perfect Impact: ===")
    high_temp_story = generate_with_temperature(prompt, temperature=0.9)
    print(high_temp_story)

### USE CASE 2: TECHNICAL EXPLANATION ###

def technical_explanation():
    prompt = "Explain how blockchain works in simple terms."

    # Negative Impact: High temperature for technical writing (too creative, inaccurate)
    print("\n=== Technical Explanation with High Temperature (0.9) - Negative Impact: ===")
    high_temp_explanation = generate_with_temperature(prompt, temperature=0.9)
    print(high_temp_explanation)

    # Perfect Impact: Optimal temperature for technical writing (accurate and focused)
    print("\n=== Technical Explanation with Adjusted Temperature (0.7) - Perfect Impact: ===")
    perfect_temp_explanation = generate_with_temperature(prompt, temperature=0.7)
    print(perfect_temp_explanation)

# Run both use cases
creative_story_generation()
technical_explanation()

Output:

PowerShell

python temperature_impact.py

=== Creative Story with Low Temperature (0.2) - Negative Impact: ===
Once upon a time, in a distant galaxy, there was a spaceship called Voyager. It was a spaceship that had been brought back from the dead by the gods. It was a spaceship that had been brought back from the dead by the gods. It was a spaceship that had been brought back from the dead by the gods. It was a spaceship that had been brought back from the dead by the gods. It was a spaceship that had been brought back from the dead by the gods. It was a spaceship that had been brought back from the dead by the gods. It was a spaceship that had been brought back from the dead by the gods. It was a spaceship that had been brought back from the dead by the gods. It was a spaceship that had been brought back from the dead by the gods. It was a spaceship that had been brought back from the dead by the gods. It was a spaceship that had been brought back from the dead by the gods. It was a spaceship that had been

=== Creative Story with High Temperature (0.9) - Perfect Impact: ===
Once upon a time, in a distant galaxy, there was a spaceship called Voyager. The ship seemed to have been flying in space as well, like the spaceship in the movie "The Voyage Home". The Captain of the Voyager was Captain Jean-Luc Picard.

In the Star Trek movies, this is true. But in the movie, on Voyager, our hero is not Jean-Luc Picard. Jean-Luc, the villain, has no desire to lead us to vengeance against the dying star.

But what about Star Trek VI: The Undiscovered Country…

In a scene that's been in development for years (one you'll almost certainly see in an upcoming Star Trek film), in the middle of the movie (one you won't see until later in the film), we see that Picard is no longer in the Star Trek universe as is the story, to be played by the same actor who played Lieutenant Dax (who was in the movie), but Picard himself.

=== Technical Explanation with High Temperature (0.9) - Negative Impact: ===
Explain how blockchain works in simple terms.

Blockchain can be used to determine if the system is trustworthy and to prevent fraud, even if the system is used in a completely different manner. Blockchain can also be used to help determine how the system is run and to ensure that its operation is efficient.

This way all your trust in the system can be verified by your actions, and you can have full control over it. When you are not trusting a computer, it can be easy to get a hold of a server and then just change the software, allowing you to control and monitor transactions with the help of the blockchain. If your business uses distributed storage then it is easy to have more control over your activities.

What do I need to understand about blockchain?

To understand how blockchain works and how you can use it properly, you must first understand how blockchain works.

Bitcoins are digital tokens, created at the start of each generation that are used to

=== Technical Explanation with Adjusted Temperature (0.7) - Perfect Impact: ===
Explain how blockchain works in simple terms.

What are the key differences between Bitcoin and Ethereum?

Blockchain is a cryptographic protocol. It can be used to create any type of transaction. It is used to store data and create new entities. It is used as a system of communication in blockchain systems.

In Ethereum, the transaction is recorded, stored, and used to perform the transaction. It is a way to transfer information. The transaction is called a "blockchain."

Since the blockchain is used for many things, it is easy to understand how the technology works. The most important difference is that Ethereum uses the blockchain to create an interface to the Internet of Things. It is this interface that allows for data exchange and the creation of new entities.

Because of this, it is possible to perform the transactions on the blockchain. So, what is the difference between Bitcoin and Ethereum?

The Bitcoin and Ethereum blockchain is a distributed ledger.

Now let's pause and analyze the output for creative story generation and technical explanation, looking at how each temperature setting affected the result. We will also see how a temperature setting that works well for one use case does the opposite for another.

Creative Story Generation

Low Temperature (Negative Impact): As you can see, the story output is very repetitive and lacks variety. This result is unsatisfactory for a creative task: the extreme repetition, caused by the model's inability to introduce new ideas, makes it undesirable for storytelling.
High Temperature (Perfect Impact): As you can see from the output, the story takes interesting directions and is very creative. The output also adds multiple threads to the story, which makes it varied, imaginative, and well suited to inventive narrative.
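The contrast between the two stories can also be checked with a rough repetition measure. The sketch below computes the share of unique word bigrams in a text; the helper `distinct_ngram_ratio` and the two sample strings are hypothetical and written for this illustration (the first only mimics the looping pattern of the low-temperature story), so treat it as a quick diagnostic rather than a standard metric implementation.

```python
def distinct_ngram_ratio(text, n=2):
    """Fraction of word n-grams in `text` that are unique (1.0 means no repeats)."""
    words = text.split()
    ngrams = [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]
    if not ngrams:
        return 0.0
    return len(set(ngrams)) / len(ngrams)

# Hypothetical samples: one looping sentence, one varied sentence
repetitive = "It was a spaceship. " * 10
varied = "The ship drifted past a dying star while the crew argued about the route home."

print(f"repetitive: {distinct_ngram_ratio(repetitive):.3f}")
print(f"varied:     {distinct_ngram_ratio(varied):.3f}")
```

Running the same helper on real generations at different temperatures would give a simple way to compare how repetitive each output is; a ratio close to zero signals the kind of loop seen at temperature 0.2.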

Technical Explanation

High Temperature (Negative Impact): Factual accuracy is essential for a use case such as a technical explanation. A high temperature introduces a lot of randomness and less probable words into the generated content, which makes it unsatisfactory for technical writing. The output above bears this out: it is too vague and includes irrelevant ideas.
Adjusted Temperature (Perfect Impact): We adjusted the temperature to a setting that strikes a good balance for generating technical content. As you can see, the output is now much more organized. At this setting, the model avoids the repetition seen at low temperatures and does not lose coherence as it does at high temperatures.

Conclusion

You have seen the ways temperature can affect content generation and which temperature setting suits each use case. Note also that temperature is not the only setting that matters for content generation; you will need to tune other parameters as well. We will cover those in the upcoming articles in the series.

Source:
https://dzone.com/articles/decoding-llm-parameters-temperature