
Agent-Based Solution for Course Enrollment Model Methodology
Explore an agent-based solution for modeling course enrollment. Understand the complexity of the system and how agents are generated using a Markov Chain model so that students' course choices can be predicted. See how PACE (Program Analysis for Completion Engine) turns curriculum into a data structure so that degree progress can be used in the prediction.
Presentation Transcript
COURSE ENROLLMENT MODEL METHODOLOGY: AN AGENT BASED SOLUTION
IAN PYTLARZ, SENIOR DATA SCIENTIST
SCOTT PU, DATA SCIENTIST
Agent Based Model Overview: A COMPLEX SYSTEM, EMULATING REALITY
Agent Generation: The essential building blocks of the model
PACE (Program Analysis for Completion Engine): Use degree progress in the prediction
Probability Generation: Put together available data to determine what students may take
Modeling: Every piece coming together
Agent Generation: THE ESSENTIAL BUILDING BLOCKS OF THE MODEL
An agent based model needs agents, who will choose courses in the final model.
To determine our agents, a Markov Chain model designed by Enrollment Management is used to evaluate enrollments across the university.
Students in this model are probabilistic (perhaps 80% likely to be a CS major, 20% likely to be MECH), leading to multiple weighted agents being created.
Markov Chain Model:
Agent   Major   BOAP   Agent Weight
Jim     CS      02     0.8
Jim     MECH    02     0.2
Jane    CHM     04     0.9
Jane    ENGL    04     0.1
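The expansion of one probabilistic student into several weighted agents can be sketched as follows. This is a minimal illustration, not the Enrollment Management Markov Chain model itself: the function name `expand_to_agents`, the dictionary layout, and the `min_weight` cutoff are assumptions introduced here, and the major probabilities are simply taken as given inputs.

```python
def expand_to_agents(student, boap, major_probs, min_weight=0.0):
    """Turn one student's major probabilities into weighted agents.

    major_probs maps a major code to the probability that the student
    is in that major (assumed to come from an upstream model).
    """
    agents = []
    for major, weight in major_probs.items():
        # Hypothetical cutoff: skip majors whose weight is negligible.
        if weight > min_weight:
            agents.append(
                {"agent": student, "major": major, "boap": boap, "weight": weight}
            )
    return agents


# The two students from the slide's example table.
jim_agents = expand_to_agents("Jim", "02", {"CS": 0.8, "MECH": 0.2})
jane_agents = expand_to_agents("Jane", "04", {"CHM": 0.9, "ENGL": 0.1})
```

Each student contributes a total weight of 1 spread across their agents, so weighted enrollments still add up to whole students.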
Agent Generation: THE ESSENTIAL BUILDING BLOCKS OF THE MODEL
Agents will eventually pick courses, but we don't know beforehand how many courses a student will choose.
An XGBoost model predicts the number of courses a student will choose, based on inputs similar to those of a grades model.
XGBoost:
Agent   College   BOAP   # Courses
Jim     CS        02     5
Jim     MECH      02     5
Jane    CHM       04     4
Jane    ENGL      04     4
PACE Curricular Structure Analysis: PROGRAM ANALYSIS FOR COMPLETION ENGINE
PACE allows us to represent curriculum in a data structure, enabling automatic comparison and extraction of meaningful information, such as progress.
[Figure label: MGMT]
How Does PACE Work? DISCIPLINES, BLOCKS, RULES & QUALIFIERS
PACE works by parsing and compiling SCRIBE code.
There is a data structure for each "discipline", which is determined by a student's program of study.
Structure shown on the slide: Discipline; Blocks (with Qualifiers); Rules (with Qualifiers); Sub-rules * (with Qualifiers).
All of this information, once parsed, is saved into a database for further analysis and use.
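One way to picture the parsed structure is as nested records. The sketch below is an assumption about how the hierarchy might nest (blocks inside a discipline, rules inside blocks, sub-rules inside rules, qualifiers attached at each level). The class names, field names, and example labels are hypothetical and do not come from PACE or SCRIBE.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Rule:
    label: str
    qualifiers: List[str] = field(default_factory=list)
    sub_rules: List["Rule"] = field(default_factory=list)  # rules may nest


@dataclass
class Block:
    label: str
    qualifiers: List[str] = field(default_factory=list)
    rules: List[Rule] = field(default_factory=list)


@dataclass
class Discipline:
    name: str
    blocks: List[Block] = field(default_factory=list)


def count_rules(rule: Rule) -> int:
    """Count a rule plus all of its nested sub-rules."""
    return 1 + sum(count_rules(sub) for sub in rule.sub_rules)


def total_rules(discipline: Discipline) -> int:
    """Count every rule and sub-rule across all blocks of a discipline."""
    return sum(count_rules(r) for b in discipline.blocks for r in b.rules)


# Hypothetical example content, for illustration only.
example = Discipline(
    name="CS",
    blocks=[
        Block(
            label="Core",
            qualifiers=["example block qualifier"],
            rules=[
                Rule("Intro sequence", sub_rules=[Rule("CS180"), Rule("CS240")]),
                Rule("Math", qualifiers=["example rule qualifier"]),
            ],
        )
    ],
)
```

A tree like this can be flattened into database rows (one per block, rule, and qualifier), which is one plausible reading of "saved into a database".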
How Is PACE Used Here? DEGREE COMPLETION AS A PREDICTIVE VARIABLE
Using PACE, the progress of each student through their discipline is calculated.
These progress percentages are binned into 8 bins to allow nulls to be filled in by classification.
These bins will be used later to help determine courses a student will pick in the model.
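The binning step might look like the sketch below. The slide does not say how the 8 bins are defined, so equal-width bins over 0 to 100 percent are assumed here. `progress_bin` is a hypothetical helper, and missing progress is passed through as `None` so that a later classification step can fill it in.

```python
def progress_bin(progress_pct, n_bins=8):
    """Map a completion percentage (0-100) to a bin index 0..n_bins-1.

    Equal-width bins are an assumption; None means progress is unknown.
    """
    if progress_pct is None:
        return None
    if not 0 <= progress_pct <= 100:
        raise ValueError("progress must be between 0 and 100")
    # 100% would land in bin n_bins, so clamp it into the last bin.
    return min(int(progress_pct / 100 * n_bins), n_bins - 1)


bins = [progress_bin(p) for p in (0, 13, 50, 99, 100, None)]
```

Turning a continuous percentage into a small set of classes is what lets unknown values be predicted as a class label rather than as a number.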
Probability Generation: DETERMINING WHAT STUDENTS WILL TAKE
Before the model gets going, we need to give our agent population choices to make.
Each student will get a set of probabilities for picking courses.
Probabilities are based on three criteria:
Major & BOAP historical frequencies
Major & years enrolled historical frequencies
Discipline & PACE completion historical frequencies
These three sets of probabilities produce different models, which are weighted together by parameters the data scientist tunes.
Probability Generation: DETERMINING WHAT STUDENTS WILL TAKE
Using historical course taking patterns, we create a set of probabilities for what students will take.
Major: CS - BOAP: 02
Course    Freq   Choices   p(x)
CS190     100    500       0.17
CS191     100    500       0.17
CS180     100    400       0.22
MA161     50     350       0.12
MA261     50     450       0.10
ENGL106   50     350       0.12
CS240     50     500       0.09
$p(x_i) = \dfrac{\mathrm{Freq}_i / \mathrm{Choices}_i}{\sum_{j=1}^{n} \mathrm{Freq}_j / \mathrm{Choices}_j}$
Include courses in the denominator for students who retook the class, but not for students who took the class and did not retake it. This will help estimate retake rates.
We also do this for pre-requisites.
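The p(x) column can be reproduced from the Freq and Choices columns by dividing each course's frequency by its number of choices and then normalizing so the values sum to 1. The sketch below does that arithmetic for the CS / BOAP 02 table. The function name `course_probabilities` is an assumption made for illustration.

```python
def course_probabilities(freq, choices):
    """Normalize per-course Freq/Choices ratios into probabilities."""
    ratios = {course: freq[course] / choices[course] for course in freq}
    total = sum(ratios.values())
    return {course: ratio / total for course, ratio in ratios.items()}


# Freq and Choices columns from the Major: CS, BOAP: 02 table.
freq = {"CS190": 100, "CS191": 100, "CS180": 100, "MA161": 50,
        "MA261": 50, "ENGL106": 50, "CS240": 50}
choices = {"CS190": 500, "CS191": 500, "CS180": 400, "MA161": 350,
           "MA261": 450, "ENGL106": 350, "CS240": 500}

p = course_probabilities(freq, choices)
rounded = {course: round(value, 2) for course, value in p.items()}
```

For example, CS180 has ratio 100/400 = 0.25, and the ratios sum to about 1.147, giving 0.25 / 1.147 ≈ 0.22.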
Probability Generation: DETERMINING WHAT STUDENTS WILL TAKE
Then, remove courses students can't take, based on pre-reqs, and normalize when assigning to an agent.
Major: CS - BOAP: 02
Course    Freq   Choices   p(x)
CS190     100    500       0.17
CS191     100    500       0.17
CS180     100    400       0.22
MA161     50     350       0.12
MA261     50     450       0.10
ENGL106   50     350       0.12
CS240     50     500       0.09
Agent: Jim (Jim can't take MA161)
Course    p(x)
CS190     0.20
CS191     0.20
CS180     0.25
MA161     -
MA261     0.11
ENGL106   0.14
CS240     0.10
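Removing an ineligible course and renormalizing can be sketched as below, using the rounded p(x) values from the table and Jim's case (MA161 removed). The helper name `restrict_and_normalize` is hypothetical, and how eligibility is actually determined from pre-requisites is outside this sketch.

```python
def restrict_and_normalize(probs, blocked):
    """Drop courses the agent cannot take, then rescale to sum to 1."""
    allowed = {c: p for c, p in probs.items() if c not in blocked}
    total = sum(allowed.values())
    if total == 0:
        raise ValueError("no eligible courses remain")
    return {c: p / total for c, p in allowed.items()}


group_probs = {"CS190": 0.17, "CS191": 0.17, "CS180": 0.22, "MA161": 0.12,
               "MA261": 0.10, "ENGL106": 0.12, "CS240": 0.09}

jim_probs = restrict_and_normalize(group_probs, blocked={"MA161"})
jim_rounded = {c: round(p, 2) for c, p in jim_probs.items()}
```

The remaining mass is 0.87, so CS180 becomes 0.22 / 0.87 ≈ 0.25, matching Jim's table.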
Modeling: BRINGING THE PIECES TOGETHER
We now have sets of agents, each with a set of course probabilities.
The model is actually quite simple, with one model for each set of agents:
For epoch in (1 to N):
    For agent in agent_data:
        For choice in (1 to the agent's num_courses):
            Generate number in [0, 1]
            Choose course from agent_data
    Sum course enrollments to epoch
Average the course enrollments across epochs
Jim CS Example (model rolls 0.7):
Course    Cume p(x)
CS190     0.35
CS191     0.50
CS180     0.62
MA261     0.80
ENGL106   0.97
CS240     1.00
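The loop described on this slide can be sketched in Python as below. This is an illustrative reading of the pseudocode, not the authors' implementation: the agent dictionary layout, the use of an optional `weight` per agent, and the choice of population standard deviation are all assumptions. Courses are drawn with replacement, so an agent may pick the same course twice.

```python
import bisect
import random
import statistics
from itertools import accumulate


def choose_course(courses, cumulative, roll):
    """Pick the first course whose cumulative probability exceeds the roll."""
    idx = bisect.bisect_right(cumulative, roll)
    return courses[min(idx, len(courses) - 1)]  # guard against float round-off


def simulate(agents, n_epochs, seed=0):
    """Return {course: (mean enrollment, std dev)} over n_epochs simulations."""
    rng = random.Random(seed)
    all_courses = sorted({c for a in agents for c in a["courses"]})
    per_epoch = {c: [] for c in all_courses}
    for _epoch in range(n_epochs):
        totals = dict.fromkeys(all_courses, 0.0)
        for agent in agents:
            cumulative = list(accumulate(agent["probs"]))
            for _choice in range(agent["num_courses"]):
                course = choose_course(agent["courses"], cumulative, rng.random())
                # Assumption: a partial agent contributes its weight.
                totals[course] += agent.get("weight", 1.0)
        for c in all_courses:
            per_epoch[c].append(totals[c])
    return {c: (statistics.mean(v), statistics.pstdev(v)) for c, v in per_epoch.items()}


# Jim's cumulative table from the slide; a roll of 0.7 falls between 0.62 and 0.80.
jim_courses = ["CS190", "CS191", "CS180", "MA261", "ENGL106", "CS240"]
jim_cume = [0.35, 0.50, 0.62, 0.80, 0.97, 1.00]
picked = choose_course(jim_courses, jim_cume, 0.7)

jim = {"courses": jim_courses, "probs": [0.35, 0.15, 0.12, 0.18, 0.17, 0.03],
       "num_courses": 5, "weight": 1.0}
results = simulate([jim], n_epochs=200, seed=1)
```

Because every epoch is kept separately before averaging, each course gets both a mean and a spread, which is what allows a standard deviation to be attached to each prediction.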
Modeling: DETERMINING WHAT STUDENTS WILL TAKE
Combine models for major/BOAP, major/year, & disc/compl.
Weights shown: 0.3, 0.5, 0.2
Probability sets shown: Major & BOAP prob, Disc & Comp. prob, Major & Year prob
Course    p(x)   p(x)   Y/N
CS190     0.20   0.40   0.25
CS191     0.20   0.10   0.25
CS180     0.25   0.15   0.2
MA161     -      -      -
MA261     0.11   0.15   0.11
ENGL106   0.14   0.09   0.14
CS240     0.10   0.10   0.05
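Weighting the three probability sets together could be done as a weighted average per course followed by renormalization, as sketched below. The transcript does not make clear which weight belongs to which probability set, so the pairing used here (0.3, 0.5, 0.2 applied to the three columns in the order they appear) is an assumption, as are the function name `combine` and the renormalization step.

```python
def combine(prob_sets, weights):
    """Weighted average of several {course: probability} sets, renormalized."""
    if len(prob_sets) != len(weights):
        raise ValueError("need one weight per probability set")
    courses = prob_sets[0].keys()
    mixed = {
        c: sum(w * probs.get(c, 0.0) for probs, w in zip(prob_sets, weights))
        for c in courses
    }
    total = sum(mixed.values())
    return {c: value / total for c, value in mixed.items()}


courses = ["CS190", "CS191", "CS180", "MA261", "ENGL106", "CS240"]
set_a = dict(zip(courses, [0.20, 0.20, 0.25, 0.11, 0.14, 0.10]))
set_b = dict(zip(courses, [0.40, 0.10, 0.15, 0.15, 0.09, 0.10]))
set_c = dict(zip(courses, [0.25, 0.25, 0.20, 0.11, 0.14, 0.05]))

# Assumed pairing of weights to sets; tune these as model parameters.
combined = combine([set_a, set_b, set_c], weights=[0.3, 0.5, 0.2])
```

Under this pairing, CS190 gets 0.3 × 0.20 + 0.5 × 0.40 + 0.2 × 0.25 = 0.31 before renormalization.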
Course Enrollment Model: COMPLETE BASIC OVERVIEW
Components shown on the slide: Course Data & Major/BOAP; PACE; Agent Generation; Choice Probabilities; Model; Results.
Results:
Course   Avg Enroll   Stdev Enroll
CS190    102          5.43
CS191    98           3.74
Methodological Benefits: WHAT DOES THIS COMPLEXITY DO FOR US?
Individual simulations allow for distributions of results.
Wrapping each simulation up separately lets us see a distribution of how enrollments might be spread out.
For instance, we can put a standard deviation on each prediction, finding which ones we are the most confident in.
Allows us to take advantage, in the future, of extremely individualized data.
PACE completion is just one small part of what could be used; entire curricular requirements could be built into demand.
Next Steps & Known Issues: HOW DO WE IMPROVE FURTHER?
Known Issues:
The model currently allows students to take a course more than once.
Proper joint probability was hard to solve, which could cause issues if the model is used improperly.
True multi-part agent predictions are computationally difficult without further optimization.
Current projections are done by using an actual prior population and tweaking it by sampling from that population.
This means that partial agents as designed can't truly exist in the system, not without waiting 8 hours to calculate them, which makes usage and testing very difficult.
Next Steps & Known Issues: HOW DO WE IMPROVE FURTHER?
Potential Improvements/Next Steps:
Include curricular requirements as a fourth plank of probability generation.
Multiple majors are ignored; this should be more nuanced.
Incoming students don't get any credits in the current model (fall problem). Another model will be needed to fill this gap.
Freshmen no longer freely choose courses; they are pre-registered. Modelling this system will improve freshmen-heavy course predictions.
Discussion: HOW DO WE IMPROVE FURTHER?
This methodology represents, currently, a foundation to be built upon.
Current inputs boil down to being very similar to inputs into a simpler model, resulting in only small improvements to the error rates in current predictions.
What other potential data could be added to take advantage of the nuance allowed by the methodology?
We discussed a few examples we intend to try, but there are virtually unlimited potential data sources.