Commit 52aa0398 authored by Niels-Oliver Walkowski

bug(vdhd.2): Remove redundant citation

parent 54de1b2e
@@ -15,20 +15,6 @@
# Abstract
title = {“{{Embed}}, Embed! {{There}}’s Knocking at the Gate.” - {{Detecting Intertextuality}} with {{Embeddings}} and the {{Vectorian}}},
booktitle = {Fabrikation von {{Erkenntnis}}. {{Experimente}} in Den {{Digital Humanities}}},
author = {Liebl, Bernhard and Burghardt, Manuel},
date = {2022-01-15},
publisher = {{Melusina Press}},
location = {{Esch-sur-Alzette}},
url = {},
isbn = {978-2-919815-25-8},
langid = {english}
*Bernhard Liebl & Manuel Burghardt, Computational Humanities Group, Leipzig University*
The detection of intertextual references in text corpora is a digital humanities topic that has gained a lot of attention in recent years. While intertextuality – from a literary studies perspective – describes the phenomenon of one text being present in another text, the computational problem at hand is the task of text similarity detection, and more concretely, semantic similarity detection. In this notebook, we introduce the Vectorian as a framework to build queries through word embeddings such as fastText and GloVe. We evaluate the influence of computing document similarity through alignments such as Waterman-Smith-Beyer and two variants of Word Mover’s Distance. We also investigate the performance of state-of-the-art sentence embeddings like Siamese BERT networks for the task, both as document embeddings and as contextual token embeddings. Overall, we find that Waterman-Smith-Beyer with fastText offers highly competitive performance. The notebook can also be used to upload new data for performing custom search queries.