Fine-Tuning of Large Language Models


Learn about fine-tuning large language models to better fit specific datasets by modifying the neural network's weights. Discover the benefits, use cases, and considerations associated with fine-tuning, including overcoming deprecation issues and achieving more efficient prompt processing.

  • Language Models
  • Fine-Tuning
  • Neural Network
  • Prompt Processing
  • Use Cases




Presentation Transcript


  1. Week 7 Video 5 Fine-Tuning Large Language Models

  2. Today's Lecture Thanks to Maciej Pankiewicz, Andres Zambrano, Xiner Rachel Liu

  3. Fine-Tuning As noted before, LLMs consist of an extremely complex neural network with an input layer, hidden layers, attention mechanisms, and an output layer. Image courtesy of glosser.ca, used under Creative Commons licensing.

  4. Fine-Tuning In fine-tuning, the developer provides additional inputs and outputs, and the weights of the neural network are modified to better fit the new data.

  5. Fine-Tuning Fine-tuning modifies the weights of this network to better fit a specific data set. According to OpenAI's website, it modifies hidden layer weights, attention mechanism weights, hyperparameter weights, and output layer weights. According to OpenAI, it can produce clear differences in output with as few as 50 examples. The more data, the better, though there is evidence of diminishing returns as the amount of data increases.
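The examples the developer provides are typically prepared as one input/output pair per line. A minimal sketch of that preparation step, using OpenAI's chat-style JSONL format; the file name and example contents here are illustrative, not from the slides:

```python
import json

# Each training example pairs a user prompt with the assistant output
# we want the fine-tuned model to produce. Contents are made up.
examples = [
    {"messages": [
        {"role": "user", "content": "Summarize: The experiment succeeded."},
        {"role": "assistant", "content": "Success: the experiment worked."},
    ]},
    {"messages": [
        {"role": "user", "content": "Summarize: Enrollment rose 10% this year."},
        {"role": "assistant", "content": "Growth: enrollment up 10%."},
    ]},
]

def write_jsonl(examples, path):
    """Write one JSON object per line, the format fine-tuning jobs expect."""
    with open(path, "w") as f:
        for ex in examples:
            f.write(json.dumps(ex) + "\n")

write_jsonl(examples, "train.jsonl")
```

The resulting file would then be uploaded when creating the fine-tuning job.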

  6. Fine-Tuning Particularly useful if you are sending the same long prompt over and over. Replaces an ongoing cost (repeatedly sending the same long prompt) with a one-time cost (fine-tuning). Also increases speed, since there is no need to repeatedly process the same long prompt.
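The one-time-cost-versus-ongoing-cost trade-off can be made concrete with back-of-envelope arithmetic. All numbers below are hypothetical, not real OpenAI pricing:

```python
# Fine-tuning trades a one-time training cost for requests that no longer
# need to carry the long shared prompt. Break-even point, with made-up prices:

def break_even_requests(training_cost, prompt_tokens_saved, cost_per_token):
    """Number of requests after which the one-time fine-tune pays for itself."""
    saving_per_request = prompt_tokens_saved * cost_per_token
    return training_cost / saving_per_request

# e.g. a $20 fine-tune that lets us drop a 2,000-token instruction prompt,
# at a hypothetical $0.000002 per input token:
n = break_even_requests(20.0, 2000, 0.000002)  # ~5,000 requests
```

Past the break-even point, every request is also faster, since the model no longer has to process those prompt tokens.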

  7. Fine-Tuning A way around deprecation: once you've fine-tuned a model, OpenAI retains it even if the base model is no longer available. WARNING: this is not guaranteed.

  8. Use cases Change style, tone, or format to match examples. If you just need a general change, prompt engineering is usually easier and faster.

  9. Use cases Correct cases where prompting produces incorrect responses. Give it the examples and tell it what to say. With enough examples, it can learn the pattern underlying the cases where its responses are not what you want.

  10. Use cases Do a task that is hard to explain, such as writing discussion forum post responses like Ryan Baker (JeepyTA).

  11. Pre-condition You need to have examples already, or a process that can generate those examples, such as showing humans LLM responses and asking them to identify incorrect ones and suggest a better alternative. Used by Khanmigo (personal communication, Kristen DiCerbo).
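The human-review process the slide describes can be sketched as a small function that turns reviewer feedback into training examples. The data shapes and field names here are assumptions for illustration:

```python
# Show humans LLM responses; a correction of None means the reviewer
# accepted the original response, otherwise the correction replaces it.

def corrections_to_examples(reviews):
    """Turn (prompt, llm_response, human_correction) triples into
    chat-format fine-tuning examples."""
    examples = []
    for prompt, llm_response, correction in reviews:
        target = correction if correction is not None else llm_response
        examples.append({"messages": [
            {"role": "user", "content": prompt},
            {"role": "assistant", "content": target},
        ]})
    return examples

reviews = [
    ("What is 2+2?", "5", "4"),             # reviewer fixed the answer
    ("Name a primary color.", "Red", None),  # reviewer accepted it
]
train = corrections_to_examples(reviews)
```

Accepted responses are kept too, so the training set reinforces good behavior as well as correcting bad behavior.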

  12. Limitations High initial cost and training time. Locks you into a model version when better ones may become available later. Changes the output, not the knowledge base; if you want the LLM to have additional knowledge, use embeddings.
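The embeddings alternative mentioned above works by retrieval rather than weight changes: documents are stored as vectors, and the closest one to the query is pulled into the prompt. A minimal sketch with toy vectors; a real system would get the vectors from an embedding model:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Toy document embeddings; the names and values are made up.
docs = {
    "course syllabus": [0.9, 0.1, 0.0],
    "grading policy":  [0.2, 0.9, 0.1],
}

def retrieve(query_vec, docs):
    """Return the name of the document closest to the query embedding."""
    return max(docs, key=lambda name: cosine(query_vec, docs[name]))

best = retrieve([0.8, 0.2, 0.0], docs)  # closest to "course syllabus"
```

The retrieved text is then prepended to the prompt, giving the LLM new knowledge without retraining any weights.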

  13. Next up Week 8: Advanced Topics
