Suggested by: Bernd Resch (bernd.resch@sbg.ac.at) and Nikolaus Augsten (nikolaus.augsten@sbg.ac.at)
Short description: Computing similarities between tweets is of
crucial importance for a number of application areas, such as disaster management,
urban planning, and the fight against crime and terrorism. However, in contrast to most
previous natural language processing (NLP) approaches, which focused purely on
textual content, the approach addressed in this master’s thesis implicitly
considers the temporal and spatial dimensions, which carry vital information.
This thesis builds on existing research, which developed an interdisciplinary
method for emotion classification that combines linguistic, temporal, and
spatial information into a single similarity metric and subsequently applies a
graph-based semi-supervised learning approach to label all tweets with an
emotion class. The main goal of this thesis is to improve the current
algorithm (1) by increasing its efficiency through the development of a new tweet
labelling algorithm, and (2) by validating the definition of the linguistic, spatial,
and temporal similarity parameters.
The master’s thesis
will be carried out together with the University of Salzburg's Computer Science Department.
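To make the idea of a single similarity metric more concrete, the following Java sketch combines a linguistic, a spatial, and a temporal similarity into one score. It is not the formula of the existing algorithm, which this description does not spell out. Everything in it is an assumption chosen for illustration: the class and method names, Jaccard similarity on word sets as the linguistic part, exponential decay over great-circle distance and over time difference, and the weighted sum with weights that add up to 1.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Illustrative sketch only: combines three similarities into one score. */
final class TweetSimilarity {

    private static final double EARTH_RADIUS_KM = 6371.0;

    /** Minimal tweet model; the fields are assumptions made for this example. */
    static final class Tweet {
        final Set<String> tokens;   // lower-cased, whitespace-separated words
        final double latDeg;
        final double lonDeg;
        final long epochSeconds;

        Tweet(String text, double latDeg, double lonDeg, long epochSeconds) {
            String cleaned = text.toLowerCase().trim();
            this.tokens = cleaned.isEmpty()
                    ? new HashSet<String>()
                    : new HashSet<String>(Arrays.asList(cleaned.split("\\s+")));
            this.latDeg = latDeg;
            this.lonDeg = lonDeg;
            this.epochSeconds = epochSeconds;
        }
    }

    /** Linguistic part (assumed): Jaccard similarity of the two word sets. */
    static double jaccard(Set<String> a, Set<String> b) {
        if (a.isEmpty() && b.isEmpty()) {
            return 0.0; // assumption: two empty texts give no evidence of similarity
        }
        Set<String> intersection = new HashSet<String>(a);
        intersection.retainAll(b);
        int unionSize = a.size() + b.size() - intersection.size();
        return (double) intersection.size() / unionSize;
    }

    /** Great-circle distance in kilometres (haversine formula). */
    static double haversineKm(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double h = Math.pow(Math.sin(dLat / 2), 2)
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * Math.pow(Math.sin(dLon / 2), 2);
        return 2 * EARTH_RADIUS_KM * Math.asin(Math.min(1.0, Math.sqrt(h)));
    }

    /** Weighted sum of the three parts; weights are assumed to add up to 1. */
    static double combined(Tweet a, Tweet b,
                           double wLing, double wSpace, double wTime,
                           double spaceScaleKm, double timeScaleSec) {
        double ling = jaccard(a.tokens, b.tokens);
        double distKm = haversineKm(a.latDeg, a.lonDeg, b.latDeg, b.lonDeg);
        double space = Math.exp(-distKm / spaceScaleKm);           // 1 at distance 0
        double dt = Math.abs(a.epochSeconds - b.epochSeconds);
        double time = Math.exp(-dt / timeScaleSec);                // 1 at the same instant
        return wLing * ling + wSpace * space + wTime * time;
    }

    public static void main(String[] args) {
        Tweet a = new Tweet("flood in Salzburg", 47.80, 13.04, 0L);
        Tweet b = new Tweet("Flood near Salzburg", 47.81, 13.05, 600L);
        System.out.println(combined(a, b, 0.5, 0.25, 0.25, 5.0, 3600.0));
    }
}
```

The weights and the two decay scales are exactly the kind of similarity parameters whose definition the thesis is meant to validate. In a graph-based labelling step, scores like this one would typically serve as edge weights between tweets.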
Literature:
Resch, B., Summa, A., Zeile, P. and Strube, M. (2016)
Citizen-centric Urban Planning through Extracting Emotion Information from
Twitter in an Interdisciplinary Space-Time-Linguistics Algorithm. Urban
Planning, 1(2), pp. 114-127.
Mann, W., Augsten, N. and Bouros, P. (2016). An Empirical
Evaluation of Set Similarity Join Techniques. In: Proceedings
of the VLDB Endowment (PVLDB).
Pak, Alexander and Patrick Paroubek (2010). “Twitter
as a Corpus for Sentiment Analysis and Opinion Mining”. In: Proceedings of the
Seventh International Conference on Language Resources and Evaluation
(LREC’10). Ed. by Nicoletta Calzolari (Conference Chair), Khalid Choukri, Bente
Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Mike Rosner, and Daniel
Tapias. Valletta, Malta: European Language Resources
Association (ELRA), pp. 1320–1326.
Abney, Steven (2008). Semisupervised Learning for
Computational Linguistics. Ed. by David Madigan, Fionn Murtagh, and Padhraic
Smyth. London: Chapman & Hall/CRC.
Start date: ASAP
Prerequisites/qualifications: interest in interdisciplinary/applied research,
preferably experience with algorithms, text mining, similarity computation and/or
Tweet analysis; programming skills (Java) are required