Large Language Models in Model Analysis

Large language models (LLMs) are powerful tools trained on vast amounts of text data and are set to shape the future of work, labor markets, education, and research. This presentation explores whether LLMs such as ChatGPT can become a valuable part of the system dynamics toolkit, supporting tasks from lesson planning and teaching to research and simulated economic experiments.

  • LLMs
  • Language Models
  • Model Analysis
  • System Dynamics
  • Research


Presentation Transcript


  1. SIMULATION AND CHATGPT: LARGE LANGUAGE MODELS AS TOOLS FOR MODEL ANALYSIS
  Ivan Đula, Tabea Berberena, Ksenia Keplinger and Maria Wirzberger
  University of Stuttgart, Cluster of Excellence EXC 2075 Data-Integrated Simulation Science, Stuttgart, Germany
  University of Stuttgart, Interchange Forum for Reflecting on Intelligent Systems (IRIS), Stuttgart, Germany
  Max Planck Institute for Intelligent Systems, Organizational Leadership & Diversity, Stuttgart, Germany

  2. Large language models (LLMs)
  • Powerful models trained on massive amounts of text data, capable of generating human-like text, answering questions, and performing various language-related tasks with high accuracy.
  • Have made significant advancements in natural language processing in recent years (Kasneci et al., 2023) and will significantly influence the future of work and labor markets (Eloundou et al., 2023), as well as education and research (Kasneci et al., 2023).
  Do LLMs (such as ChatGPT) have the potential to become a valuable tool in the SD toolkit? If yes, how can they be used to support and improve our work?

  3. How else can we use LLMs in SD?
  THE USE OF LLMs IN SD
  • Generating CLDs
  • Assessment
  • Evaluation
  THE USE OF LLMs IN RESEARCH AND TEACHING ENVIRONMENTS
  • Lesson planning
  • Language learning
  • Education in systems thinking
  • Writing tasks

  4. Large Language Models as Simulated Economic Agents (Horton, 2023)
  • Most economic research takes one of two forms: "What would homo economicus do?" and "What did homo sapiens actually do?"
  • Newly developed large language models (LLMs), because of how they are trained and designed, can be thought of as implicit computational models of humans (a homo silicus).
  • These models can be used the same way economists use homo economicus: they can be given endowments, put in scenarios, and then their behavior can be explored.
  • LLMs, by nature of their training and design, (1) are computational models of humans and (2) likely possess a great deal of latent social information.
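  To make the homo silicus pattern concrete, here is a minimal sketch of the endow-then-observe loop Horton describes, assuming the OpenAI Python client (v1+); the model name, persona, and prompts are illustrative placeholders rather than anything used in the paper or this talk.

    # Hypothetical sketch: give an LLM agent an endowment, place it in a
    # scenario, and record its stated behavior.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    endowments = [
        "high initial trust in artificial agents",
        "low initial trust in artificial agents",
    ]

    for endowment in endowments:
        response = client.chat.completions.create(
            model="gpt-4",  # placeholder model name
            messages=[
                {"role": "system",
                 "content": f"Act like a software developer with {endowment}."},
                {"role": "user",
                 "content": "Your workload doubles overnight. Do you delegate "
                            "tasks to an artificial agent? Explain your reasoning."},
            ],
        )
        print(endowment, "->", response.choices[0].message.content)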

  5. Hooked on artificial agents: a systems thinking perspective (Đula et al., 2023)
  https://doi.org/10.3389/frbhe.2023.1223281

  6. [image-only slide; no transcript text]

  7. Scenarios
  • Scenario 1 (step-up in workload): at a point in time the workload increases in a step-like manner and stays at the increased level.
  • Scenario 2 (a more efficient AA): same as Scenario 1, but the AA has double the efficiency compared to Scenario 1.
  • Scenario 3 (initial trust): same as Scenario 1, but the human agent has either high or low initial trust.
  • Scenario 4 (emotional and social response): same as Scenario 1, but the human agent is significantly more sensitive to the use of the AA.
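  For intuition only, the toy simulation below shows what a step-up in workload looks like quantitatively. It is a deliberately simplified rendering of the workload, effort, AA-use, and trust loops described on the prompt slides; every equation and parameter here is an illustrative assumption, not the authors' simulation model.

    # Toy Euler-integrated sketch of Scenario 1: a step increase in incoming
    # work, with effort and artificial-agent (AA) use draining the workload
    # stock. All functional forms and constants are made-up assumptions.
    import numpy as np

    dt, horizon = 0.25, 100.0
    t = np.arange(0.0, horizon, dt)
    inflow = np.where(t < 10.0, 1.0, 2.0)  # step-up in workload at t = 10

    workload, trust = 1.0, 0.5  # stocks: open tasks and trust in the AA (0..1)
    for k in range(len(t)):
        effort = 0.8 * workload              # workload -> effort (+)
        aa_use = trust * workload            # workload, trust -> use of AA (+)
        performance = effort + 1.5 * aa_use  # effort, use of AA -> performance (+)
        # performance -> workload (-): solved tasks leave the stock
        workload = max(workload + (inflow[k] - 0.5 * performance) * dt, 0.0)
        # successful delegation slowly builds trust (performance -> trust (+))
        trust = min(trust + 0.01 * aa_use * dt, 1.0)

    print(f"final workload {workload:.2f}, final trust {trust:.2f}")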

  8. Instructing ChatGPT
  We instructed ChatGPT to act like a software developer and proceeded to explain the qualitative model to it one causal link after another. For each causal link, we explained what happens if the cause variable increases. Once we had covered all causal links in a given feedback loop, we pointed that out to ChatGPT. Finally, we instructed ChatGPT to assume a neutral level of trust and to forecast the short-term and long-term consequences of a step increase in the workload. We did not give any instruction on which performance measures to focus on.
  Excerpt from the prompt: "Act like a software developer. Your workload has a positive causal relationship with your effort; meaning that the more work you have, the more effort you need to expend. The effort has a positive causal relationship with your performance; meaning the more effort you expend, the more programming tasks you can solve. Your performance has a negative causal relationship with your workload; meaning the more programming tasks you solve, the lower your workload. These relationships form a balancing feedback loop. ... Assuming neutral initial trust towards artificial agents, forecast your use of artificial agents in the short-term and the long-term if you experience a step increase in your workload. Use the relationships identified above to explain your forecast."
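  The session itself was run interactively in ChatGPT; purely as an illustration, a scripted equivalent of this link-by-link protocol might look like the sketch below (OpenAI Python client assumed; the model name is a placeholder).

    # Hypothetical scripted version of the prompting protocol: persona first,
    # then one causal link at a time, then the forecasting question.
    from openai import OpenAI

    client = OpenAI()

    causal_links = [
        "Your workload has a positive causal relationship with your effort; "
        "meaning that the more work you have, the more effort you need to expend.",
        "The effort has a positive causal relationship with your performance; "
        "meaning the more effort you expend, the more programming tasks you can solve.",
        "Your performance has a negative causal relationship with your workload; "
        "meaning the more programming tasks you solve, the lower your workload.",
        "These relationships form a balancing feedback loop.",
    ]

    prompt = ("Act like a software developer. " + " ".join(causal_links) +
              " Assuming neutral initial trust towards artificial agents, "
              "forecast your use of artificial agents in the short-term and "
              "the long-term if you experience a step increase in your "
              "workload. Use the relationships identified above to explain "
              "your forecast.")

    reply = client.chat.completions.create(
        model="gpt-4",  # placeholder
        messages=[{"role": "user", "content": prompt}],
    )
    print(reply.choices[0].message.content)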

  9. SHORT-TERM FORECAST
  With a sudden increase in workload, there would be a direct increase in effort expended to manage the tasks. Increased effort would likely lead to higher performance initially, as more tasks are being completed. However, the increased workload would also trigger a rise in the use of artificial agents to handle some tasks, alongside the increased effort. This surge in the use of artificial agents would initially alleviate some workload pressure. As workload decreases due to the combined efforts and artificial agents' assistance, emotional and social responses may also decrease temporarily, leading to more rational decision-making. The rational decision-making would further boost performance in the short term.

  10. LONG-TERM FORECAST
  Over time, as workload continues to be managed through the combined efforts of the developer and artificial agents, trust in the artificial agents would likely increase due to their consistent assistance in task completion. Increased trust would lead to a further increase in the use of artificial agents, as the developer becomes more reliant on their support. With continued reliance on artificial agents, emotional and social responses may diminish further, leading to even more rational decision-making. The cycle continues with improved performance due to rational decision-making and increased trust in artificial agents.

  11. Promising results
  • ChatGPT is clearly capable of remembering the model structure and using system dynamics language to make qualitative forecasts based on that structure.
  • We observed a significant amount of correspondence between simulated behavior and ChatGPT's forecasts, and in many cases it was able to accurately refer to the model structure to explain its forecasts.
  • It is able to incorporate new information as we alternate between scenarios, as well as to drop currently active endowments and revert to previous ones when necessary.
  • It demonstrated its potential as a stand-in for an expert with whom you can interact and exchange information about a particular model.

  12. Concerns
  • Most notably, it is clear that ChatGPT does not really understand system dynamics and related concepts, which is not surprising given its probability-based responses. It repeatedly described a reinforcing feedback loop as stabilizing, although a reinforcing loop amplifies change rather than balancing it: "The reinforcing feedback loop between the use of artificial agents, emotional and social response, undesirable behavior, effort, performance, and workload would continue to stabilize the system, maintaining a balance between workload, effort, performance, and the use of artificial agents in the long term."
  • Furthermore, ChatGPT will sometimes go beyond the model limits and reach for the wealth of information it is trained on when giving responses: "Despite the efficiency of the artificial agent, the developer may prefer to handle tasks themselves initially due to skepticism or lack of confidence in the artificial agent's abilities."
  • The most troubling observation is that it does not seem to be capable of capturing some of the more nuanced and counterintuitive behaviors that arise from the complex interplay of feedback loops: "In the long term, the doubling of efficiency would continue to have a profound effect on the use of artificial agents. With tasks being completed more efficiently, trust in the artificial agents would likely increase even further, leading to a sustained reliance on their assistance."

  13. BACKUP SLIDES

  14. Prompt 1
  Act like a software developer. Your workload has a positive causal relationship with your effort; meaning that the more work you have, the more effort you need to expend. The effort has a positive causal relationship with your performance; meaning the more effort you expend, the more programming tasks you can solve. Your performance has a negative causal relationship with your workload; meaning the more programming tasks you solve, the lower your workload. These relationships form a balancing feedback loop.
  In addition to effort, the workload has a positive causal relationship with the use of artificial agents; meaning that if the workload increases, you may use artificial agents to handle some of the workload in addition to increasing your effort. The use of artificial agents has a negative causal relationship with your workload; meaning that increasing the use of artificial agents decreases the amount of workload you have. These relationships form another balancing feedback loop.
  The use of artificial agents also depends on your level of trust in artificial agents. There is a positive causal relationship between trust and the use of artificial agents; meaning that the more you trust the artificial agent, the more you are willing to use it. Your trust in artificial agents depends on your performance. Your performance has a positive causal relationship with trust in artificial agents; meaning that the more programming tasks you can solve using artificial agents, the more trust you will have in artificial agents. The relationships between the use of artificial agents, workload, effort, performance, and trust form another balancing feedback loop.
  The use of artificial agents has a negative causal relationship with your emotional and social response; meaning that the more you use artificial agents, the less emotionally you react and the less concerned about social rules and norms you are. Your emotional and social response has a negative causal relationship to your rationality of interactions; meaning that the more emotional and social your response is, the less rational you behave in your decision-making. Your rationality of interactions has a positive causal relationship with your performance; meaning that the more rational you are with your decision-making, the more programming tasks you can solve. The relationships between the use of artificial agents, emotional and social response, the rationality of interactions, performance, and workload form another balancing feedback loop. The relationships between the use of artificial agents, emotional and social response, performance, and trust form a reinforcing feedback loop.
  Your emotional and social response also has a negative causal relationship with your propensity for undesirable behavior; meaning that the more emotional and social your response is, the less likely you are to engage in any problematic behaviors. Your propensity for undesirable behavior has a negative causal relationship with your effort; meaning that the more you engage in problematic behaviors, the less effort you expend. The relationships between the use of artificial agents, the emotional and social response, undesirable behavior, effort, performance, and workload form a reinforcing feedback loop. The relationships between the use of artificial agents, the emotional and social response, undesirable behavior, effort, performance, and trust form a balancing feedback loop.
  Assuming neutral initial trust towards artificial agents, forecast your use of artificial agents in the short-term and the long-term if you experience a step increase in your workload. Use the relationships identified above to explain your forecast.

  15. Prompts 2-4
  Prompt 2: Imagine that you can double the efficiency of the artificial agent. How would that change your forecast?
  Prompt 3: Imagine you are using the artificial agent with the initial efficiency. How would having low and high levels of initial trust in artificial agents, rather than neutral, change your forecast?
  Prompt 4: Imagine having neutral trust again. If your emotional and social response was significantly more sensitive to the use of artificial agents, how would that change your forecast?
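  Prompts 2-4 read as follow-up turns in the same conversation, which is what lets the model keep the causal structure from Prompt 1 in context while endowments are swapped between scenarios. A sketch of that multi-turn pattern, again assuming the OpenAI Python client and a placeholder model name:

    # Hypothetical multi-turn loop: each scenario prompt is appended to the
    # same message history, so earlier causal links and endowments stay in
    # context across scenarios.
    from openai import OpenAI

    client = OpenAI()

    PROMPT_1 = "Act like a software developer. ..."  # full Prompt 1 from slide 14
    FOLLOW_UPS = [
        "Imagine that you can double the efficiency of the artificial agent. "
        "How would that change your forecast?",
        "Imagine you are using the artificial agent with the initial efficiency. "
        "How would having low and high levels of initial trust in artificial "
        "agents, rather than neutral, change your forecast?",
        "Imagine having neutral trust again. If your emotional and social "
        "response was significantly more sensitive to the use of artificial "
        "agents, how would that change your forecast?",
    ]

    messages = []
    for turn in [PROMPT_1] + FOLLOW_UPS:
        messages.append({"role": "user", "content": turn})
        reply = client.chat.completions.create(model="gpt-4", messages=messages)
        answer = reply.choices[0].message.content
        messages.append({"role": "assistant", "content": answer})
        print(answer, "\n---")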

  16.-19. [image-only backup slides; no transcript text]

  20. Forecast: Scenario 1
  Given the complex interplay of variables described in the scenario, let's break down the effects of a step increase in workload on the use of artificial agents in the short-term and long-term:
  Short-term Forecast: With a sudden increase in workload, there would be a direct increase in effort expended to manage the tasks. Increased effort would likely lead to higher performance initially, as more tasks are being completed. However, the increased workload would also trigger a rise in the use of artificial agents to handle some tasks, alongside the increased effort. This surge in the use of artificial agents would initially alleviate some workload pressure. As workload decreases due to the combined efforts and artificial agents' assistance, emotional and social responses may also decrease temporarily, leading to more rational decision-making. The rational decision-making would further boost performance in the short term.
  Long-term Forecast: Over time, as workload continues to be managed through the combined efforts of the developer and artificial agents, trust in the artificial agents would likely increase due to their consistent assistance in task completion. Increased trust would lead to a further increase in the use of artificial agents, as the developer becomes more reliant on their support. With continued reliance on artificial agents, emotional and social responses may diminish further, leading to even more rational decision-making. The cycle continues with improved performance due to rational decision-making and increased trust in artificial agents. However, as emotional and social responses decrease and reliance on artificial agents grows, there may be a slight uptick in undesirable behavior due to reduced vigilance. This increase in undesirable behavior might lead to a slight decrease in effort expended. Nonetheless, the reinforcing feedback loop between the use of artificial agents, emotional and social response, undesirable behavior, effort, performance, and workload would stabilize the system, maintaining a balance between workload, effort, performance, and the use of artificial agents in the long term.
  Overall, in both the short-term and long-term, a step increase in workload would lead to a significant rise in the use of artificial agents, accompanied by improvements in performance, but potentially with some fluctuations in emotional and social responses and undesirable behavior.

  21. Forecast: Scenario 2
  Doubling the efficiency of the artificial agent would have significant implications for the forecast:
  Short-term Forecast: With a doubling of efficiency, the artificial agent would be able to handle tasks more swiftly and effectively, reducing the workload on the developer even further. This increased efficiency would lead to a rapid decrease in workload, as tasks are completed at a faster rate. As workload decreases, the need for the developer to expend effort would also reduce, freeing up more time for other tasks or relaxation. The combination of reduced workload and effort would likely lead to a decrease in the use of artificial agents in the short term, as the developer may rely less on their assistance when the workload is manageable without it. However, despite the decrease in artificial agent usage, performance would likely remain high due to the increased efficiency and the developer's ability to focus on more complex tasks.
  Long-term Forecast: In the long term, the doubling of efficiency would continue to have a profound effect on the use of artificial agents. With tasks being completed more efficiently, trust in the artificial agents would likely increase even further, leading to a sustained reliance on their assistance. This increased trust, coupled with the efficiency gains, would result in a continued high use of artificial agents in task completion. As reliance on artificial agents grows, emotional and social responses may continue to diminish, leading to consistently rational decision-making. The reinforcing feedback loop between the use of artificial agents, emotional and social response, undesirable behavior, effort, performance, and workload would continue to stabilize the system, maintaining a balance between workload, effort, performance, and the use of artificial agents in the long term.
  Overall, doubling the efficiency of the artificial agent would lead to a more pronounced decrease in workload and effort in the short term, followed by a sustained high use of artificial agents in the long term, resulting in improved performance and a stable system.

  22. Forecast: Scenario 3
  Let's explore how having low and high levels of initial trust in artificial agents would impact the forecast:
  Low Initial Trust: With low initial trust, the developer would be hesitant to rely on artificial agents extensively, even with their initial efficiency. Despite the efficiency of the artificial agent, the developer may prefer to handle tasks themselves initially due to skepticism or lack of confidence in the artificial agent's abilities. This reluctance to use artificial agents would result in a slower uptake of their assistance, leading to a less significant reduction in workload in the short term. As workload remains relatively high and trust in artificial agents does not increase quickly, emotional and social responses may also remain more pronounced, leading to potentially less rational decision-making. The slower adoption of artificial agents and the persistence of emotional and social responses could lead to a less efficient overall system in the short term, with performance improvements being limited. In the long term, trust in artificial agents may gradually increase as the developer observes their effectiveness, leading to a more balanced use of artificial agents and potentially improved performance over time.
  High Initial Trust: Conversely, with high initial trust in artificial agents, the developer would be more inclined to rely heavily on their assistance from the outset. Despite the initial efficiency of the artificial agent, the developer's high trust would lead to a rapid uptake of their assistance, resulting in a significant reduction in workload in the short term. As workload decreases, the developer may also experience a reduction in effort expended, as they rely more on the artificial agent's capabilities. With workload and effort reduced, performance may improve rapidly in the short term, as the developer can focus on higher-value tasks. Additionally, with high trust in artificial agents, emotional and social responses may diminish more quickly, leading to more rational decision-making and further performance improvements. In the long term, the high trust in artificial agents would likely lead to a sustained high use of their assistance, resulting in continued performance improvements and a stable system.
  In summary, initial levels of trust in artificial agents can significantly influence the rate of adoption, workload reduction, emotional and social responses, and ultimately, the performance of the system in both the short term and the long term.

  23. Forecast: Scenario 4
  If emotional and social responses were significantly more sensitive to the use of artificial agents, it would have notable implications for the forecast:
  Short-term Forecast: With heightened sensitivity to the use of artificial agents, even with neutral trust, the developer's emotional and social responses would be more strongly influenced by their reliance on these agents. As the developer increasingly relies on artificial agents to handle tasks, their emotional and social responses may fluctuate more prominently. Initially, this heightened sensitivity could lead to increased emotional reactions, potentially including feelings of unease or discomfort with relinquishing control to artificial agents. These emotional responses could impact decision-making, potentially leading to hesitancy or resistance to fully utilize artificial agents, despite their efficiency. Consequently, the reduction in workload and effort may be slower in the short term compared to a scenario with less sensitivity to the use of artificial agents. Performance improvements may also be limited initially due to the slower adoption of artificial agents and the potential for emotional responses to interfere with rational decision-making.
  Long-term Forecast: In the long term, the impact of heightened sensitivity to the use of artificial agents would continue to influence the system dynamics. As trust in artificial agents gradually increases due to their demonstrated effectiveness, emotional and social responses may stabilize, leading to more rational decision-making. However, the sensitivity to the use of artificial agents could still lead to periodic fluctuations in emotional and social responses, especially during times of increased reliance on these agents. Despite these fluctuations, the reinforcing feedback loop between the use of artificial agents, emotional and social response, undesirable behavior, effort, performance, and workload would still stabilize the system over time. Overall, while heightened sensitivity to the use of artificial agents may initially slow down the adoption and efficiency gains, the system would eventually reach a stable state with continued performance improvements and a balanced use of artificial agents.
  In summary, increased sensitivity to the use of artificial agents would likely introduce more variability in emotional and social responses, potentially impacting the rate of adoption and efficiency gains in the short term, but the system would eventually stabilize and achieve performance improvements in the long term.
