
Exploring Scholarly Topic Development Through Diverse Data Sources
"Cascades, Islands, and Streams" is a project by a team of researchers from Indiana University Bloomington, the University of Wolverhampton, and the University of Quebec at Montreal. It aims to correct the science bias in maps of science that rely on journal citations, by integrating datasets that cover a broad range of scholarly activities and by using method triangulation. The research questions concern how topics emerge and develop across different scholarly activities, and how the type of activity in which a topic appears affects its lifecycle. Four broad topic areas are analysed: Cognitive Science, Digital Humanities, History of Science, and Social Network Analysis.
Presentation Transcript
CASCADES, ISLANDS AND STREAMS INDIANA UNIVERSITY BLOOMINGTON UNIVERSITY OF WOLVERHAMPTON UNIVERSITY OF QUEBEC AT MONTREAL PRESENTED BY: DR KAYVAN KOUSHA (WOLVERHAMPTON)
PROJECT TEAM
Indiana University Bloomington: Cassidy Sugimoto (head), Ying Ding, Staša Milojević
University of Wolverhampton: Mike Thelwall, Kayvan Kousha (presenting)
University of Quebec at Montreal: Vincent Larivière
http://mapofscience.com/nih.html PROJECT IDEA To correct the science bias in maps of science that rely upon journal citations
OUR PROPOSAL
1. Integrate several datasets representing a broad range of scholarly activities (not just journal publishing)
2. Use method triangulation to explore the lifecycle of topics within and across a range of scholarly activities
3. Develop transparent tools and techniques to enable future predictive analyses
EXPLORE TOPIC EMERGENCE DIFFERENCES
A topic occurs consistently in one type of activity, then cascades in a linear fashion to other areas (cascades)
OR emerges in one area then flows into other areas (streams)
OR emerges in different places and remains in separate islands (islands)
RESEARCH QUESTIONS What is the nature of topic development in relation to core scholarly activities? How does the type of activity in which a topic appears impact the lifecycle and duration of that topic?
DATASETS
ProQuest Dissertation and Theses database
National Science Foundation grant database
Social Science and Humanities Research Council of Canada grant database
Web of Science (Century of Science database)
Internet discussions
Blogs
Twitter
Mendeley
TOPICS
We will analyse these four broad topic areas:
Cognitive Science
Digital Humanities
History of Science
Social Network Analysis
METHODS
Word analysis: words used as proxies for topics to investigate topic flows over time
Topic modelling: identifying topics by statistical analysis of word co-occurrences
Burst detection: identifying sub-topic emergence by detecting significant increases in word frequencies
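The burst detection idea above can be illustrated with a minimal Python sketch. It is not the project's actual method (the slides do not say which algorithm was used): it simply flags a year as a burst when a word's share of all tokens is at least `ratio` times its average share in earlier years. The function names, the `ratio` and `min_share` thresholds, and the toy documents are all assumptions made for illustration.

```python
from collections import Counter


def yearly_share(docs_by_year, word):
    """Return {year: share of all tokens in that year equal to `word`}."""
    shares = {}
    for year, docs in sorted(docs_by_year.items()):
        tokens = [t for doc in docs for t in doc.lower().split()]
        counts = Counter(tokens)
        shares[year] = counts[word] / len(tokens) if tokens else 0.0
    return shares


def detect_bursts(shares, ratio=2.0, min_share=0.01):
    """Flag years whose share is >= `ratio` times the mean share of all earlier years.

    A simple threshold heuristic (hypothetical), not a formal burst model.
    """
    years = sorted(shares)
    bursts = []
    for i, year in enumerate(years[1:], start=1):
        baseline = sum(shares[y] for y in years[:i]) / i
        current = shares[year]
        # max(..., 1e-9) lets a word that was absent before still count as a burst
        if current >= min_share and current >= ratio * max(baseline, 1e-9):
            bursts.append(year)
    return bursts


# Toy corpus (invented for illustration only)
docs_by_year = {
    2008: ["network analysis of citations", "citation analysis methods"],
    2009: ["citation network study", "journal citation analysis"],
    2010: ["twitter and citation", "twitter networks twitter use"],
}

twitter_bursts = detect_bursts(yearly_share(docs_by_year, "twitter"))
print(twitter_bursts)  # [2010]
```

The same yearly shares also serve as the raw material for the word analysis method, where a word's frequency over time acts as a proxy for a topic.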
EXAMPLE OF FINDINGS: TEDTALKS: ANALYSIS OF IMPACT
CASSIDY SUGIMOTO & MIKE THELWALL
Sugimoto, C.R. & Thelwall, M. (in press). Scholars on soap boxes: Science communication and dissemination via TED videos. Journal of the American Society for Information Science and Technology.
MOTIVATION
New popular genre
Public dissemination of science
Educational videos
Infotainment
RESEARCH QUESTIONS In which communication forms do TEDTalks have the greatest impact? Which disciplinary types of TEDTalks have the greatest impact? Do different communication forms have similar types of impact?
Metric | Minimum | Mean | Maximum | Total | Valid
TED web site views | 44,441 | 517,437 | 9,946,996 | 620,406,446 | 1,199
YouTube views | 462 | 99,184 | 3,991,983 | 111,681,275 | 1,126
Blog citations (Google blog search estimates) | 0 | 9,073 | 441,000 | 10,905,376 | 1,202
YouTube Likes | 2 | 900 | 26,591 | 1,013,231 | 1,126
YouTube Favorite count | 3 | 767 | 38,139 | 863,458 | 1,126
YouTube comments (count hint) | 0 | 368 | 21,703 | 414,311 | 1,126
TED web site comments | 8 | 187 | 5,921 | 224,629 | 1,199
YouTube Dislikes | 0 | 69 | 1,456 | 78,053 | 1,126
Online mentions related to academic syllabi | 0 | 2 | 50 | 2,070 | 1,202
Online mentions in PDF and Word documents | 0 | 0 | 49 | 592 | 1,202
Google Scholar citations | 0 | 0 | 75 | 505 | 1,202
Google Books citations | 0 | 0 | 18 | 434 | 1,202
Online mentions in PowerPoint presentations | 0 | 0 | 238 | 392 | 1,202
Mendeley readers | 0 | 0 | 30 | 231 | 1,202
Web of Knowledge citations | 0 | 0 | 5 | 47 | 1,202
YouTube Like proportion | 0.260 | 0.900 | 1.000 | - | 1,126
Metric | Art & Design rank sum (194) | Science & Technology rank sum (405) | Others rank sum (440) | Significance of rank sum differences
TED web site views | 468.33 | 526.49 | 532.22 | 0.036
YouTube views | 419.4 | 499.07 | 495.76 | 0.192
Blog citations | 467.42 | 524.5 | 537.9 | 0.022
YouTube comments | 338.94 | 513.31 | 517.77 | 0*
TED web site comments | 374.81 | 521.61 | 578.36 | 0*
Online mentions related to academic syllabi | 278.77 | 355.09 | 350.86 | 0.001*
Online mentions in PDF and Word documents | 299.8 | 343.28 | 351.84 | 0.007
Google Scholar citations | 327.26 | 354.01 | 329.78 | 0.009
Google Books citations | 313.54 | 359.53 | 331.09 | 0.005
Online mentions in PowerPoint presentations | 341.73 | 335.94 | 339.31 | 0.796
Mendeley readers | 330.96 | 343.31 | 337.64 | 0.458
Web of Knowledge citations | 332.61 | 342.05 | 338.01 | 0.337
YouTube Like proportion | 428.13 | 544.53 | 448.13 | 0*
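A rank-based comparison of categories like the one in the table above can be sketched in Python as follows. The slides do not name the statistical test, so the Kruskal-Wallis H statistic used here is an assumption, and the sketch omits both the tie correction and the p-value. The function names and the toy values are hypothetical.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mean_ranks_and_h(groups):
    """groups: {category name: list of metric values for its videos}.

    Returns (mean rank per category, Kruskal-Wallis H without tie correction).
    """
    names = list(groups)
    pooled = [v for name in names for v in groups[name]]
    ranks = average_ranks(pooled)  # rank all videos together
    n_total = len(pooled)
    mean_ranks = {}
    h_sum = 0.0
    start = 0
    for name in names:
        size = len(groups[name])
        rank_sum = sum(ranks[start:start + size])
        mean_ranks[name] = rank_sum / size
        h_sum += rank_sum ** 2 / size
        start += size
    h = 12.0 / (n_total * (n_total + 1)) * h_sum - 3 * (n_total + 1)
    return mean_ranks, h


# Toy view counts per category (invented for illustration only)
toy_groups = {
    "Art & Design": [1, 2, 3],
    "Science & Technology": [4, 5, 6],
    "Others": [7, 8, 9],
}
toy_mean_ranks, toy_h = mean_ranks_and_h(toy_groups)
```

A larger H indicates that the categories' ranks differ more than would be expected by chance; converting H into a significance value requires a chi-squared distribution with (number of categories - 1) degrees of freedom, which is left out of this sketch.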
Spearman's rho (columns are numbered in the same order as the rows)
Metric | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14
1. WoK | 1 | .264 | .103 | .186 | .157 | .174 | .110 | .133 | .099 | .062 | .123 | .076 | .112 | .089
2. Google Scholar | .264 | 1 | .198 | .408 | .272 | .270 | .089 | .191 | .202 | .145 | .229 | .132 | .239 | .194
3. Mendeley | .103 | .198 | 1 | .231 | .215 | .205 | .178 | .160 | .133 | .081 | .166 | .102 | .176 | .139
4. Google Books | .186 | .408 | .231 | 1 | .315 | .312 | .175 | .276 | .234 | .150 | .273 | .087 | .252 | .197
5. PDF and doc | .157 | .272 | .215 | .315 | 1 | .382 | .165 | .230 | .245 | .196 | .285 | .167 | .276 | .241
6. Syllabi | .174 | .270 | .205 | .312 | .382 | 1 | .160 | .437 | .353 | .322 | .425 | .162 | .440 | .405
7. PowerPoint | .110 | .089 | .178 | .175 | .165 | .160 | 1 | .095 | .100 | .057 | .127 | .035 | .124 | .082
8. Blogs | .133 | .191 | .160 | .276 | .230 | .437 | .095 | 1 | .496 | .427 | .554 | .255 | .610 | .498
9. YouTube views | .099 | .202 | .133 | .234 | .245 | .353 | .100 | .496 | 1 | .681 | .902 | .368 | .724 | .540
10. YouTube comments | .062 | .145 | .081 | .150 | .196 | .322 | .057 | .427 | .681 | 1 | .651 | .064 | .560 | .728
11. YouTube Favourites | .123 | .229 | .166 | .273 | .285 | .425 | .127 | .554 | .902 | .651 | 1 | .464 | .773 | .579
12. YouTube Like prop. | .076 | .132 | .102 | .087 | .167 | .162 | .035 | .255 | .368 | .064 | .464 | 1 | .369 | .169
13. TED site views | .112 | .239 | .176 | .252 | .276 | .440 | .124 | .610 | .724 | .560 | .773 | .369 | 1 | .683
14. TED site comments | .089 | .194 | .139 | .197 | .241 | .405 | .082 | .498 | .540 | .728 | .579 | .169 | .683 | 1
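For readers unfamiliar with the statistic, each cell in the correlation matrix above is a Spearman's rho: the Pearson correlation computed on the ranks of two metrics rather than on their raw values. The following Python sketch shows that calculation using only the standard library. The function names and the toy numbers are illustrative assumptions, not the study's data or code.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (sd_x * sd_y)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the average ranks of x and y."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy per-video metrics (invented for illustration only)
toy_views = [10, 20, 30, 40, 50]
toy_comments = [1, 3, 2, 5, 4]
rho = spearman_rho(toy_views, toy_comments)  # approximately 0.8
```

Ranking before correlating makes the statistic robust to the heavily skewed counts typical of web metrics, where a few videos attract most of the views.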
SOME FINDINGS
The different popularity measures (views and comments on YouTube and on the TED site) broadly agreed on which videos were most popular.
Most videos were found in at least one online syllabus, and videos in online syllabi tended to be more viewed, discussed and blogged.
Less liked videos generated more discussion.
Science and technology videos presented by academics were more liked than those presented by non-academics, so academics are not disadvantaged on TED.
NEXT STEP Integrating web, citation and dissertation data into one huge analysis