Assessing Human-Mediated Current Awareness Services at ISI 2015 Symposium


This presentation delves into the intellectual editing process in digital libraries, focusing on how editors create subject-specific reports. The case study of RePEc, its statistics, and the New Economics Papers (NEP) service are discussed, highlighting manual selection processes and editorial efforts.

  • Symposium
  • Digital Libraries
  • Subject-Specific Reports
  • RePEc
  • NEP

Uploaded on Mar 10, 2025



Presentation Transcript


  1. Assessing a human mediated current awareness service. International Symposium of Information Science (ISI 2015), Zadar, 2015-05-20. Zeljko Carevic (1), Thomas Krichel (2) and Philipp Mayr (1). (1) firstname.lastname@gesis.org, (2) lastname@openlib.org

  2. Slide 2 / 31 Outline: 1. Introduction; 2. RePEc and NEP; 3. Results (3.1 Editing time, 3.2 Indicators for report success, 3.3 Editing effort); 4. Conclusion and Outlook

  3. Slide 3 / 31 Motivation: Thomas Krichel, the founder of RePEc, visited GESIS Cologne in Oct. 2014, sharing his Russian souvenir: ~100 GB of XML log files.

  4. Slide 4 / 31 1. Introduction: Current awareness in digital libraries. The goal is to inform users / subscribers about new / relevant acquisitions in their libraries [1]. Current awareness services allow subscribers to keep up to date with new additions in a certain area of research. Relevant documents can be selected (semi-)automatically or manually. This work focuses on the intellectual editing process. Aim of this work: How do editors work when creating a subject-specific report in Digital Libraries (DL)?

  5. Slide 5 / 31 2. Use case: RePEc. RePEc (Research Papers in Economics) is a DL for working papers in economics research. It covers metadata for working papers and journal articles. Document metadata usually contains links to full texts.

  6. Slide 6 / 31 2. RePEc statistics. Contributing archives: ~1,700. Documents: 1.77 million. Full-text documents: 1.63 million. Registered authors: ~45,000. Abstract views (April 2015): >2 million. [Chart: number of documents by year, 1996 to 2016.]

  7. Slide 7 / 31 2. Current awareness service NEP. NEP (New Economics Papers) is a current awareness service for new additions in RePEc. NEP covers subject-specific reports from over 90 specific fields, e.g. Business, Economic and Financial History; Public Economics; Social Norms and Social Capital. Issues are sent to subscribers via e-mail, RSS and Twitter. Reports on new additions are generated by subject-specific editors. Relevant document selection is done manually by the editor!

  8. Slide 8 / 31 Nep-all contains all new RePEc documents, is created on a roughly weekly basis and contains on average 488 documents. Manual selection of relevant documents is a time-consuming task. [Diagram: the editors of Nep-acc, Nep-afr, Nep-upt and Nep-ure each select documents from Nep-all and send an issue.]

  9. Slide 9 / 31 ERNAD. ERNAD (Editing Reports on New Academic Documents) is a purpose-built system. It re-ranks nep-all for each editor based on the specific report topic, looking at past issues of a report to produce a ranked nep-all. If presorting works well, editors select highly ranked documents from nep-all.

  10. Slide 10 / 31 ERNAD example for Nep-Africa (NEP-AFR). Nep-all unsorted: 1. Tax compliance..; 2. Mental accounting..; 212. Ethnic ..in Africa; 317. Sino-African relations: Nep-all presorted: 1. Ethnic ..in Africa; 2. Sino-African relations:; 50. Tax compliance..; 51. Mental accounting..
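
The slides do not say how ERNAD computes its ranking, only that it looks at past issues of a report. As a rough illustration of the idea, the sketch below swaps in a plain word-overlap score: a title ranks higher the more often its words appeared in titles from past issues. The functions `tokenize` and `presort` and all titles are hypothetical and are not ERNAD's actual method or data.

```python
from collections import Counter

def tokenize(title):
    """Lower-case a title and strip simple punctuation from each word."""
    words = (w.strip(".,:;").lower() for w in title.split())
    return [w for w in words if w]

def presort(nep_all, past_titles):
    """Rank nep-all titles by word overlap with past issues (assumed scoring)."""
    seen = Counter(w for t in past_titles for w in tokenize(t))

    def score(title):
        return sum(seen[w] for w in tokenize(title))

    # sorted() is stable, so titles with equal scores keep their nep-all order.
    return sorted(nep_all, key=score, reverse=True)

# Hypothetical titles, not taken from RePEc.
past_titles = ["Trade in Africa", "Growth in Africa"]
nep_all = ["Tax compliance", "Mental accounting", "Aid flows to Africa"]
print(presort(nep_all, past_titles))
```

With these toy titles, the Africa-related paper moves to the top while the two unrelated papers keep their original relative order.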

  11. Slide 11 / 31 Editing stages

  12. Slide 12 / 31 Research questions. RQ 1: How long is the editing duration? RQ 2: What influences the success of a report (editing duration, issue size)? RQ 3: How much effort is invested in selecting and sorting papers per issue (Precision @ N, relative search length)?

  13. Slide 13 / 31 RQ 1: Editing time. How much time do editors invest to create a report?

  14. Slide 14 / 31 Pre-selection: Editing an issue can be interrupted, which would distort the results. Interrupted issues are excluded by separating the edit duration into 3-minute chunks.
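
A minimal sketch of this pre-selection step, assuming each issue has a single edit duration in minutes and that durations of 90 minutes or more count as interrupted (the limit given on the following slide). The function names and the sample durations are hypothetical.

```python
CHUNK_MINUTES = 3    # chunk width named on the slide
LIMIT_MINUTES = 90   # cut-off, assumed from the "edit time < 90 min" limit

def chunk_label(duration_min):
    """Return the upper edge of the 3-minute chunk, or ">90" past the limit."""
    if duration_min >= LIMIT_MINUTES:
        return ">90"
    return (int(duration_min // CHUNK_MINUTES) + 1) * CHUNK_MINUTES

def keep_uninterrupted(durations_min):
    """Keep only edit durations below the limit (treated as uninterrupted)."""
    return [d for d in durations_min if d < LIMIT_MINUTES]

# Hypothetical edit durations in minutes, not taken from the NEP logs.
durations = [2.5, 15.5, 53, 240]
print([chunk_label(d) for d in durations])
print(keep_uninterrupted(durations))
```

Counting the labels returned by `chunk_label` over all issues would give a histogram with one bar per 3-minute chunk plus a final ">90" bar.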

  15. Slide 15 / 31 Pre-selection. [Histogram: number of issues per 3-minute chunk of edit time, from 3 to 90 and >90 minutes.] Limit: edit time < 90 min.

  16. Slide 16 / 31 RQ 1: Editing time. [Bar chart: average editing time in minutes per report.] Max. 53 minutes: NEP-ETS (Economic time series). Avg. 15.5 minutes (sd = 10.1). Min. 2.5 minutes: NEP-RES (Resource economics).

  17. Slide 17 / 31 Summary of RQ 1: The average editing time is comparatively low at 15.5 minutes. There is wide scatter between the reports: min. 2.5 minutes, max. 53 minutes.

  18. Slide 18 / 31 RQ 2: Influences on successful reports. The popularity of a report can be measured by the number of subscribers. There is wide scatter in the number of subscribers per report: max. 6859 for NEP-HIS (Business, Economic and Financial History), min. 75 for NEP-CIS (Confederation of Independent States). Factors influencing report success include, for example, the topic and age of a report. Does the issue size or the editing time influence the report success?

  19. Slide 19 / 31 Editing time. [Scatter plot: average editing time vs. number of subscribers.] Education: 2198 subscribers (avg. 836). Project, Program and Portfolio Management: 43.5 min (avg. 15.5).

  20. Slide 20 / 31 Issue size. [Scatter plot: average issue size vs. number of subscribers.] Sports: issue size 2.5 (avg. 12.4). Demographic Economics: issue size 21 (avg. 12.4).

  21. Slide 21 / 31 Summary of RQ 2: There is no correlation between issue size and number of subscribers, or between editing time and number of subscribers. We assume that the success of a report is mainly driven by topic and age.
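
The slides do not name the correlation measure used. As one possible way to run such a check, the sketch below computes a Pearson correlation coefficient between a per-report feature (for instance average issue size) and the number of subscribers. The choice of Pearson's r and the toy values are assumptions, not the authors' data or method.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Toy values chosen so that the two series are uncorrelated; not NEP data.
feature = [1, 2, 3, 4]
subscribers = [1, -1, -1, 1]
print(pearson(feature, subscribers))
```

A coefficient near zero, as with these toy values, would match the "no correlation" reading, while values near 1 or -1 would indicate a strong linear relationship.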

  22. Slide 22 / 31 RQ 3: Effort in selecting and sorting. How much effort is invested in selecting and sorting relevant documents from nep-all? Two measures are used: Precision @ N and relative search length.

  23. Slide 23 / 31 Precision @ N: How many of the top N documents from pre-sorted nep-all are selected for the issue? N is set to 5, 10, 15 and 20. We only consider issues where issue size > N. A document is relevant if its index position in nep-all is < N.

  24. Slide 24 / 31 Example: P@5. M = {(D1, 4), (D2, 1), (D3, 7), (D4, 3), (D5, 9)}. P@5 for issue I in report J = [formula image not included in transcript]. Editors vary between using pre-sorted and un-sorted nep-all. Therefore: only consider issues with pre-sort usage > 50.
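
Since the formula itself did not survive in the transcript, the following is only a sketch of how P@N could be computed from the definition on the previous slide. It assumes that each pair in M is (document, 1-based position in pre-sorted nep-all), that a document counts as a hit when its position is at most N, and that the hit count is divided by N. The function name `precision_at_n` is hypothetical.

```python
def precision_at_n(selected, n):
    """Share of the top-n positions of pre-sorted nep-all that were selected.

    selected: list of (doc_id, position) pairs with 1-based positions (assumed).
    """
    hits = sum(1 for _, position in selected if position <= n)
    return hits / n

# The set M from the slide: documents of one issue with their nep-all positions.
M = [("D1", 4), ("D2", 1), ("D3", 7), ("D4", 3), ("D5", 9)]
print(precision_at_n(M, 5))
```

Under these assumptions three of the five selected documents (positions 4, 1 and 3) fall within the top 5, so the sketch yields 3/5. Reading the slide's condition literally as "position < N" gives the same count for this example.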

  25. Slide 25 / 31 Results for P@N. Avg. P@5 (82 reports): 0.77. Avg. P@10 (64 reports): 0.80. Avg. P@15 (50 reports): 0.80. Avg. P@20 (31 reports): 0.82. Max. found for nep-env (Environmental Economics) with P@5 = 0.99. Min. found for nep-cba (Central Bank) with P@5 = 0.35.

  26. Slide 26 / 31 Summary of P@N: Editors work comfortably with the presorting in nep-all. The number of papers per issue has no significant influence on the precision.

  27. Slide 27 / 31 Relative Search Length: We know how many of the top N documents from nep-all are selected. To what depth do editors inspect nep-all? The relative search length is the ratio between the highest index position (hin) of the last relevant document in nep-all and the length of nep-all.

  28. Slide 28 / 31 Example RSL: The editor is given a nep-all containing 300 documents. M = {(D1, 4), (D2, 10), (D3, 7)}. RSL = 10/300. We assume that the editor has inspected nep-all up to document 10.
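
The RSL example can be reproduced with a few lines. This is a minimal sketch that assumes each pair in M is (document, position in nep-all). The function name `relative_search_length` is hypothetical.

```python
def relative_search_length(selected, nep_all_length):
    """Deepest selected position divided by the length of nep-all."""
    deepest = max(position for _, position in selected)
    return deepest / nep_all_length

# The set M and the nep-all length from the slide.
M = [("D1", 4), ("D2", 10), ("D3", 7)]
print(relative_search_length(M, 300))
```

The deepest selected document sits at position 10, so the result equals 10/300, matching the value on the slide.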

  29. Slide 29 / 31 Relative Search Length. [Bar chart: average RSL per report.] Max.: NEP-MAC (Macroeconomics), RSL = 0.35. Avg. RSL = 0.08. Min.: NEP-SPO (Sports and Economics), RSL = 0.01.

  30. Slide 30 / 31 Summary of RSL: The relative search length is comparatively low at 0.08. Editors select papers from the very upper part of nep-all.

  31. Slide 31 / 31 Conclusion: We focused on observable system features: editing time, influences on report success, and effort in creating an issue. Summary: The system supports the editor well in creating an issue. A complete view requires a more user-centred observation. Future work: Why and under what conditions is a document relevant? NEP provides many opportunities for further research on data that is relatively easily available.

  32. Thank you! Questions?
