Saturday, June 23 • 11:00am - 12:30pm
Session 2: Data mining and knowledge discovery



An expertise recommender system based on data from an Institutional Repository (DiVA)
Authors: Milena Angelova, Technical University of Sofia; Vishnu Manasa Devagiri, Blekinge Institute of Technology; Veselka Boeva, Blekinge Institute of Technology; Peter Linde, Blekinge Institute of Technology; Niklas Lavesson, Blekinge Institute of Technology
Finding experts in academia is an important practical problem, e.g. recruiting reviewers for conference, journal or project submissions, matching partners for research proposals, or finding relevant M.Sc. or Ph.D. supervisors. In this work, we discuss an expertise recommender system that is built on data extracted from the Blekinge Institute of Technology (BTH) instance of the institutional repository system DiVA. The developed prototype system is evaluated and validated on information extracted from the BTH DiVA installation, concerning thesis supervision by researchers affiliated with BTH. The extracted DiVA classification terms are used to build an ontology that conceptualizes the thesis domain supported by the university. The supervisor profiles of the tutors affiliated with BTH are constructed from the extracted DiVA data. These profiles can further be used to identify and recommend relevant thesis supervisors for a given subject.
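
As a rough illustration of the profile-based recommendation described in this abstract, the sketch below aggregates classification terms per supervisor and ranks supervisors by cosine similarity to a query. It is a minimal sketch under stated assumptions, not the authors' system: the flat term-count profiles stand in for the ontology-based profiles, and the function names, supervisor names and terms are all hypothetical.

```python
from collections import Counter
from math import sqrt


def build_profiles(theses):
    """Aggregate classification terms per supervisor into term-count profiles.

    `theses` is an iterable of (supervisor, [terms]) pairs (hypothetical format).
    """
    profiles = {}
    for supervisor, terms in theses:
        profiles.setdefault(supervisor, Counter()).update(terms)
    return profiles


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def recommend(profiles, query_terms, top_n=3):
    """Return up to `top_n` supervisors whose profiles overlap the query terms."""
    query = Counter(query_terms)
    scored = [(cosine(profile, query), name) for name, profile in profiles.items()]
    scored = [(score, name) for score, name in scored if score > 0]
    # Highest similarity first; ties broken alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:top_n]]


# Illustrative data only; not taken from the BTH DiVA installation.
theses = [
    ("Supervisor A", ["Computer Sciences", "Software Engineering"]),
    ("Supervisor A", ["Software Engineering"]),
    ("Supervisor B", ["Telecommunications"]),
    ("Supervisor C", ["Computer Sciences"]),
]
profiles = build_profiles(theses)
```

Calling `recommend(profiles, ["Computer Sciences"])` ranks Supervisor C ahead of Supervisor A, because C's profile consists only of that term while A's is dominated by another one.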

Automatic subject indexing and classification using text recognition and computer-based analysis of tables of contents
Author: Jan Pokorný, National Library of Technology, Prague
This paper will describe a method for machine-based creation of high-quality subject indexing and classification for both electronic and print documents using tables of contents (ToCs). The technology described here is primarily focused on electronic and print documents for which, for technical or licensing reasons, it is not possible to index the full text. However, the technology would also be useful for full-text documents, because analyzing the structure of ToCs could significantly enhance the accuracy and relevance of subject description.

Availability of cultural heritage structured metadata in the World Wide Web
Authors: Nuno Freire, INESC-ID; Pável Calado, INESC-ID, IST, University of Lisbon; Bruno Martins, INESC-ID, IST, University of Lisbon
On the World Wide Web, a very large number of resources are made available through digital libraries. The existence of many individual digital libraries, maintained by different organizations, brings challenges to the discoverability, sharing and reuse of the resources. A widely used approach is metadata aggregation, where centralized efforts like Europeana facilitate the discoverability and use of the resources by collecting their associated metadata. The cultural heritage domain embraced the aggregation approach while, at the same time, the technological landscape kept evolving. Nowadays, cultural heritage institutions are increasingly applying technologies designed for wider interoperability on the Web. This paper presents a study of how cultural heritage data providers currently apply technological solutions for making structured metadata available for reuse on the Internet. We investigated the use of both linked data and technologies related to the indexing of resources by Internet search engines. We conducted a harvesting experiment on the landing pages of websites of digital libraries that participate in Europeana, and collected statistics about the usage of these particular technologies. These technologies allow structured data to be represented within HTML, or to be referred to by links within HTML or through HTTP headers. We conclude with a discussion of future work for establishing a solution for cultural heritage aggregation based on the current situation and the available technologies.
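
The kind of landing-page check described in this abstract could be sketched as follows. This is a minimal sketch and not the authors' harvesting tool: the abstract does not name the technologies surveyed, so the specific markers checked here (JSON-LD script elements, Microdata and RDFa attributes, `rel="alternate"` links to RDF serializations, and the HTTP `Link` header) are assumptions, as are the names `StructuredDataDetector`, `detect` and `RDF_TYPES`. The attribute checks are crude heuristics.

```python
from html.parser import HTMLParser

# Media types treated as RDF serializations (an assumption for this sketch).
RDF_TYPES = {"application/rdf+xml", "text/turtle", "application/ld+json"}


class StructuredDataDetector(HTMLParser):
    """Collects coarse signals of structured data embedded in or linked from HTML."""

    def __init__(self):
        super().__init__()
        self.found = set()

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        media_type = (attrs.get("type") or "").lower()
        if tag == "script" and media_type == "application/ld+json":
            self.found.add("json-ld")
        if "itemscope" in attrs or "itemprop" in attrs:
            self.found.add("microdata")
        if "property" in attrs or "typeof" in attrs:
            self.found.add("rdfa")
        rel = (attrs.get("rel") or "").lower()
        if tag == "link" and rel == "alternate" and media_type in RDF_TYPES:
            self.found.add("linked-alternate")


def detect(html, headers=None):
    """Return the set of structured-data signals found in a page and its headers."""
    parser = StructuredDataDetector()
    parser.feed(html)
    found = set(parser.found)
    # Naive substring check on the Link header; a real harvester would parse it.
    link_header = (headers or {}).get("Link", "")
    if any(media_type in link_header for media_type in RDF_TYPES):
        found.add("http-link-header")
    return found


# Hypothetical landing page used only to exercise the detector.
sample_html = (
    '<html><head>'
    '<script type="application/ld+json">{"@type": "Book"}</script>'
    '<link rel="alternate" type="application/rdf+xml" href="/record.rdf"/>'
    '</head><body>'
    '<div itemscope itemtype="https://schema.org/Book">'
    '<span itemprop="name">Title</span></div>'
    '</body></html>'
)
signals = detect(sample_html)
```

Running `detect` over each harvested page and counting the returned signals would give per-technology usage statistics of the sort the study reports.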

Publication of data derived from patient authored and curated private personal healthcare data
Authors: Peter Pennefather, gDial Inc. and Dan Faculty of Pharmacy, University of Toronto; West Suhanic, gDial Inc.; Fatima Lakha, Inclusive Media and Design Center, Ryerson University; Deborah I. Fels, Inclusive Media and Design Center, Ryerson University
An inclusive systemic design is specified for publishing data derived from personal data authored and curated by patients. The use case is care for medically significant pain and distress, together with multi-purpose analysis of data derived from unstructured patient reports about their experiences with that medical care. The design specifies how to store and access derived data through distributed ledgers that support qualitative and quantitative analysis by diverse users. It allows patients to author and curate their private data and enables polycentric governance over publication and analysis of the common pool resource of data derived from that private healthcare-related data.


Raed Sharif

Senior Program Officer, International Development Research Centre Canada


Veselka Boeva

Blekinge Institute of Technology

Nuno Freire

Universidade de Lisboa

Peter Linde

Blekinge Institute of Technology

Peter Pennefather

professor emeritus, gDial Inc
My wife Yvonne and I started rowing in 2004 as cross training for winter speed skating. We have been recreational members of Hanlan since 2007. As a recreational rower, I value the sport of rowing as an opportunity to interact with people and nature around me. I appreciate it as a...

Jan Pokorný

National Library of Technology, Prague

Saturday June 23, 2018 11:00am - 12:30pm EDT
Room 538, Faculty of Information