Fatemeh Fahimnia, Fresheteh Montazeri
Volume 1, Issue 2 (7-2014)
Abstract
Background and Aim: This investigation studied the self-archiving behavior of Knowledge and Information Sciences (KIS) faculty members in Iran. It aimed to identify the incentives and barriers affecting this behavior and to establish a baseline for predicting the extent of self-archiving.
Method: A descriptive survey method was used. The population comprised all KIS faculty members affiliated with universities and research centers supervised by the Ministry of Science, Research and Technology in Iran.
Results: Based on self-reporting by the population studied, the extent of self-archiving is above average. Self-archiving on personal and corporate websites was more prevalent than in institutional and subject repositories. The recognition component was the most important incentive, and copyright considerations were the most important barrier to self-archiving by KIS faculty members. Among the 10 factors studied, only the professional recognition component was capable of predicting self-archiving of scientific output on open access websites.
Conclusion: KIS faculty members in Iran welcome open access to their scientific works, but obstacles such as copyright remain; removing them could help improve current conditions.
Nosrat Riahinia, Farzaneh Shadanpour, Keyvan Borna, Gholam Ali Montazer
Volume 9, Issue 3 (10-2022)
Abstract
Purpose: This study investigates automatic keyword extraction from the tables of contents of Persian e-books in the field of science using LDA topic modeling, and evaluates the extracted keywords by their similarity to the golden standard and by users' views of them.
Methodology: This is a mixed-method text-mining study in which LDA topic modeling is used to extract keywords from the tables of contents of scientific e-books. The approach was evaluated in two ways: by computing cosine similarity and by qualitative evaluation by users.
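As a minimal sketch of the cosine similarity step, the snippet below compares two keyword lists as bag-of-words count vectors. The abstract does not say how the keywords were vectorized, so the count-based representation, the function name `cosine_similarity`, and the sample English keywords are all assumptions made for illustration only.

```python
from collections import Counter
from math import sqrt


def cosine_similarity(keywords_a, keywords_b):
    """Cosine similarity of two keyword lists treated as term-count vectors.

    Assumption: raw term counts are used as vector weights; the study may
    have used a different weighting scheme.
    """
    vec_a = Counter(keywords_a)
    vec_b = Counter(keywords_b)
    # Dot product over the terms the two lists share.
    dot = sum(vec_a[t] * vec_b[t] for t in vec_a.keys() & vec_b.keys())
    norm_a = sqrt(sum(c * c for c in vec_a.values()))
    norm_b = sqrt(sum(c * c for c in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty list has no direction, so report no similarity
    return dot / (norm_a * norm_b)


# Hypothetical keyword lists, not taken from the study's data.
gold = ["algebra", "matrix", "vector", "equation"]
extracted = ["matrix", "function", "equation", "limit"]
score = cosine_similarity(gold, extracted)
print(score)  # 0.5: two shared terms out of four in each list
```

A score near 0 means the two keyword sets share almost no terms, and a score of 1 means they have identical term distributions.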
Findings: Tables of contents are medium-length texts with a trimmed mean of 260.02 words, about 20% of which are stop-words. The cosine similarity between the golden standard keywords and the output keywords is 0.0932, which is very low. Users fully agreed that the keywords extracted with the LDA topic model represent the subject field of the whole corpus. For describing the subject of each individual document, however, the golden standard keywords, the keywords extracted with the LDA topic model from sub-domains of the corpus, and the keywords extracted from the whole corpus were successful in that order.
Conclusion: The keywords extracted using the LDA topic model can be used in unspecified and unknown collections to extract the hidden thematic content of the whole collection, but not to accurately relate each topic to each document in large, thematically heterogeneous collections. In collections of texts from a single subject field, such as mathematics or physics, where the vocabulary is less diverse and more uniform, more coherent and relevant keywords are obtained, but the relevance of the keywords to each document still needs to be checked. In formal subject analysis of individual documents, this approach can serve as a keyword suggestion system for indexers and subject analysts.