
Harnessing Big Text Data: CS410 DSO Text Information Systems Course Overview
Explore CS410 DSO: Text Information Systems, taught by ChengXiang ("Cheng") Zhai at the University of Illinois at Urbana-Champaign. The course covers the motivation for harnessing big text data, the two main techniques (text retrieval and text mining), and the course's goals, format, and grading structure. It emphasizes both theory and practice, pairing basic concepts with practical skills, and supports personalized and collaborative learning through forum-based interaction and a Community Digital Library.
Presentation Transcript
CS410 DSO: Text Information Systems
Course Introduction
Instructor: ChengXiang ("Cheng") Zhai
Department of Computer Science
University of Illinois at Urbana-Champaign

Motivation: Harnessing Big Text Data
Text data is ubiquitous and growing rapidly: Internet, blogs, news, email, literature, Twitter
Many knowledge applications!

Main Techniques for Harnessing Big Text Data: Text Retrieval + Text Mining
[Diagram: Big Text Data -> Text Retrieval -> Small Relevant Data -> Text Mining -> Many Knowledge Applications]

Design of CS410: Overview
Online Videos + High Engagement
MOOC 1: Text Retrieval
MOOC 2: Text Mining
Project & Tech Review
[Diagram: Big Text Data -> Text Retrieval -> Small Relevant Data -> Text Mining -> Many Knowledge Applications]

Design of CS410: Goals
Emphasize both theory and practice
  Theory: basic concepts and general principles are applicable to all applications
    Lectures + Quizzes + Exams
  Practice: specific practical skills are immediately useful
    Programming assignments
  Integration of theory and practice
    Course projects

Design of CS410: Goals
Personalized learning
  Self-paced + choices of project & technology review
Collaborative learning
  Forum-based interactions and collaboration
  Community Digital Library (CDL): https://textdata.org/about
  Group projects, group technology reviews
Students use a Chrome extension to regularly save useful Web resources related to CS410 to a CS410 Digital Library (and earn extra credit). Everyone can then search and browse the library to find useful supplementary materials (e.g., a useful explanation of a difficult concept).

Design of CS410: Format & Grading
Synchronous: weekly office hours via video-teleconferencing
Asynchronous: question answering & discussion via forums, and collaborative learning via the Community Digital Library
Extra credit: +5%
Graded components:
  MOOC 1 (Text Retrieval): lecture videos, quizzes, exam, programming
  MOOC 2 (Text Mining): lecture videos, quizzes, exam, programming
  Course project: topic selection, proposal, progress report, software deposit, presentation
  Tech review (4 credit hr. only)
[Grading diagram: the slide assigns a percentage weight to each component. The values appearing in the transcript are 20%, 5%, 5%, 25%, 5%, 30%, 65%, 25%, 20%; their mapping to components is not recoverable from the extracted text.]

You have Complete Control over Your Grade!
A+: [95, 100]
A: [90, 94]
A-: [85, 89]
B+: [80, 84]
B: [75, 79]
B-: [70, 74]
C: [60, 69]
D: [55, 59]
F: < 55
5% extra credit would help move your grade up by one bracket

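The bracket table above can be read as a simple lookup. The following is only a minimal sketch, not the course's official grading code: the function name `letter_grade`, the treatment of fractional scores (assigned by each bracket's lower bound), and the cap at 100 are all assumptions made for illustration.

```python
# Lower bound of each bracket, taken from the slide, highest first.
BRACKETS = [
    (95, "A+"), (90, "A"), (85, "A-"),
    (80, "B+"), (75, "B"), (70, "B-"),
    (60, "C"), (55, "D"),
]


def letter_grade(score, extra_credit=0.0):
    """Map a percentage score (plus optional extra credit) to a letter.

    Hypothetical helper: the slide lists integer ranges only, so assigning
    fractional scores by lower bound and capping at 100 are assumptions.
    """
    total = min(score + extra_credit, 100.0)
    for lower_bound, letter in BRACKETS:
        if total >= lower_bound:
            return letter
    return "F"


print(letter_grade(82))     # B+
print(letter_grade(82, 5))  # A-
```

Under these assumptions, 5 points of extra credit moves most scores up one bracket, as the slide says. The C bracket is 10 points wide, so a score in its lower half can stay in C: `letter_grade(62, 5)` still returns `"C"`.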
Your Work Load
[Timeline figure, Aug to Dec, from the first day of instruction to the last day of instruction, with Thanksgiving Break marked. Rows: Lecture Videos, Quizzes, Proctored Exams, Programming Assignment, Project, Technology Review. One label reads "Last 2 Weeks".]

Forum
Discussion Forum (Campuswire) is the primary way of interaction and engagement
Asynchronous discussion enables participation of everyone
Enables faster question answering without waiting until an office hour
Facilitates identification of difficult concepts to be covered in office hours

Protocol of Question Answering
As soon as you have a question or issue to discuss, post it immediately on Forum
  In general, use the option "post to everyone" unless you have a private question (e.g., about your grade), in which case you can choose "post to Instructors & TAs"
If the question is not answered in a timely manner on Forum, or is not addressed adequately, email the question to all of us (i.e., the instructor and TAs) using a subject line containing the keyword CS410DSO
If you don't receive a reply from us by email in a timely manner, join an office hour (i.e., a video conference)

Format of Office Hours
The TAs and the instructor will hold weekly office hours at published time slots using video-conferencing (Zoom); each of us will generally hold one 1-hour office hour per week
Students can join or leave an office hour as needed at any time
Priority list in descending order:
  High: issues posted on Forum but unresolved even after email communications with the TAs/instructor
  Medium: other unresolved issues on Forum
  Low: any questions or issues not posted on Forum, brought by a student joining an office hour (first come, first served)

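The priority list above behaves like a priority queue with arrival order as the tie-breaker. The sketch below illustrates that ordering only. The class `OfficeHourQueue` and its methods are hypothetical, and applying first-come, first-served within every level (the slide states it only for low-priority items) is an assumption.

```python
import heapq
from itertools import count

# Priority levels from the slide; a smaller number is handled first.
PRIORITY = {"high": 0, "medium": 1, "low": 2}


class OfficeHourQueue:
    """Hypothetical helper: orders issues by priority, then by arrival."""

    def __init__(self):
        self._heap = []
        self._arrival = count()  # increasing counter breaks ties in FIFO order

    def add(self, level, issue):
        heapq.heappush(self._heap, (PRIORITY[level], next(self._arrival), issue))

    def next_issue(self):
        # Pops the highest-priority, earliest-arrived issue.
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)


queue = OfficeHourQueue()
queue.add("low", "walk-in question A")
queue.add("medium", "unresolved forum post B")
queue.add("low", "walk-in question C")
queue.add("high", "forum post D, unresolved after email")

order = [queue.next_issue() for _ in range(len(queue))]
print(order)
```

In this usage example the high-priority item is handled first, then the medium one, then the two walk-in questions in the order they arrived.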
How to Get the Most out of CS410 DSO?
Plan ahead based on your own schedule, and act early
  Allocate sufficient time to prepare for the two proctored exams
  Complete quizzes and programming assignments ahead of time whenever possible
  Post questions on Forum immediately whenever you have difficulty understanding any part of the course materials
Leverage collaborative learning
  Actively participate in forum discussions (you'll learn from reading posts on Forum)
  Contribute to the CS410 Community Digital Library (CDL) by regularly saving useful Web resources related to the course (you'll benefit from content saved by peers)
  Earn up to 5% extra credit by 1) making an effort to answer others' questions on Forum, and/or 2) making an effort to contribute content to the CS410 CDL

If you have already taken the MOOC(s)
You can, and are encouraged to, finish all the quizzes and most programming assignments much earlier
However, you cannot take an exam earlier, and some programming tasks may require synchronization and thus cannot be finished earlier than scheduled
Enjoy more time to work on your course projects (if you want to)!

For more information, visit the course website: https://courses.engr.illinois.edu/cs410/fa2023