Orbis Dictus - From Lexical Statistical Computation to Natural Language Processing and Self Customisation

Nader Harb
Roma Tre University, Italy

Francesco Agrusti
Roma Tre University, Italy


One of the main objectives of the Italian nationally funded project Adaptive Message Learning (am-Learning) is to produce a system capable of understanding the learner's lexical needs and providing him/her with adapted study material that can be understood with the least help from third-party knowledge sources (e.g. dictionaries, the web, etc.). Over the four-year project, this result was obtained using lexical statistical computation techniques to measure relevant educational competences, such as reading comprehension. This approach led us to relate the idea of "word frequency" to the difficulty of comprehending a word in a given document. Frequency here is defined as how many times a word occurs in a large collection of texts belonging to the same category (e.g. cardiology, physiotherapy, sociology, etc.), called a corpus. The higher a word's frequency, the higher the probability that the reader (student) already knows it and understands its meaning in the document, and vice versa.

One of the products of the am-Learning project is an advanced e-learning platform (or LMS, learning management system) called Orbis Dictus. It is already operational and implements the above approach to deliver automatically adapted e-learning materials and tests based on lexical statistical algorithms. This innovative platform comprises three distinct technological tools, each devoted to a different LMS functionality: the LexMeter module outlines an initial user profile by assessing the learner's lexical competence; the ProgressMeter module creates short cloze tests to report and monitor the learner's gradual improvement along the learning path; and the third module, called Adapter, uses the results of the other two to automatically adjust text documents (e.g. manuals) according to the following hypothesis: introducing more detailed explanations of low-frequency (hard) words helps students better understand the given text material. In other words, starting from a fixed text inserted by the course tutor, the study material is automatically enriched with definitions and explanations, so as to provide the student with an already tuned text matching his/her reading skill.

Some enhancements were proposed to improve this approach: taking advantage of the vast collection of words added to the web by its users every day, and building a system that does not need to be explicitly reprogrammed every time new data arrives, or every time it makes a wrong (or right) judgement, but simply learns from the newly collected data (machine learning). One more addition deserving mention is natural language processing: studies and algorithms from this field were adopted to help the system treat articles and books not as collections of single words, but as collections of phrases and paragraphs that cover a subject (the "aboutness" concept).
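The abstract does not give the project's actual algorithms, but the core frequency idea can be sketched in a few lines. The following is a minimal illustrative sketch, not the LexMeter implementation; all function names and the frequency threshold are assumptions introduced here: a word's count over a domain corpus serves as a proxy for the probability that the learner already knows it, and words below a threshold are flagged as "hard".

```python
from collections import Counter
import re

def build_frequency_table(corpus_texts):
    """Count word occurrences across a domain corpus
    (e.g. a collection of cardiology texts)."""
    counts = Counter()
    for text in corpus_texts:
        counts.update(re.findall(r"[a-z]+", text.lower()))
    return counts

def hard_words(document, freq_table, threshold=2):
    """Return the words in the document whose corpus frequency falls
    below the threshold, i.e. the words the learner is least likely
    to already know."""
    words = set(re.findall(r"[a-z]+", document.lower()))
    return sorted(w for w in words if freq_table[w] < threshold)
```

For example, given a toy corpus in which "heart" occurs often but "murmurs" never does, `hard_words` would single out "murmurs" as a candidate for extra explanation. The threshold would in practice be tuned per corpus and per learner profile.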
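A cloze test of the kind ProgressMeter produces deletes words from a passage and asks the learner to restore them. As a rough sketch only (the project's gap-selection strategy is not described in the abstract; the fixed-interval deletion and the function name are assumptions), a basic generator might remove every n-th word:

```python
def make_cloze(text, gap_every=5, blank="_____"):
    """Replace every n-th word with a blank; return the gapped text
    together with the removed words (the answer key)."""
    words = text.split()
    answers = []
    for i in range(gap_every - 1, len(words), gap_every):
        answers.append(words[i])
        words[i] = blank
    return " ".join(words), answers
```

Scoring the learner's fill-ins against the answer key would then yield the comprehension measure used to track gradual improvement along the learning path.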
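The Adapter's hypothesis, that inserting explanations of low-frequency words improves comprehension, can likewise be illustrated with a toy sketch. This is not the Adapter's implementation; the inline-parenthesis format, the glossary dictionary, and the function name are all assumptions made for illustration:

```python
def adapt_text(document, glossary, hard_word_set):
    """Append an inline explanation after each low-frequency word
    that has an entry in the glossary."""
    out = []
    for token in document.split():
        out.append(token)
        key = token.lower().strip(".,;:")
        if key in hard_word_set and key in glossary:
            out.append(f"({glossary[key]})")
    return " ".join(out)
```

Given the hard words identified for a particular learner profile, the tutor's fixed text is thus turned into an already tuned version matching that learner's reading skill.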
