{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "## Imports" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import pandas as pd\n", "import os" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Configuration\n", "\n", "*seed_words_filename:* The full path to the input seed word list, a plain-text file with one word per line. For example: `path/to/seed_words/seed_words.txt`.\n", "\n", "*output_dir:* The path to the directory where the created annotation files are saved. The path must end with a '/' (slash), because the file name is appended to it directly. For example: `path/to/output/`." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "seed_words_filename = \"results/raw/seed_words.txt\"\n", "output_dir = \"results/annotated/\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Directory Setup (Optional)\n", "Creates the output directory from the configuration if it does not already exist." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# exist_ok=True makes this a no-op when the directory is already there.\n", "os.makedirs(output_dir, exist_ok=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Seed Word Annotation" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Load seed words" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Read one seed word per line, skipping blank lines.\n", "with open(seed_words_filename, \"r\", encoding=\"utf-8\") as inputfile:\n", "    seed_words = [line.strip() for line in inputfile if line.strip()]\n", "print(\"loaded {} seed words\".format(len(seed_words)))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Create annotation file" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "annotator = input(\"enter name of annotator: \").strip()\n", "\n", "# One row per seed word, with an empty sentiment column to fill in by hand.\n", "annotation_df = pd.DataFrame(index=seed_words, columns=[\"sentiment\"])\n", "annotation_df.index.name = \"word\"\n", 
"annotation_df.to_csv(\"{}{}_seed_words.csv\".format(output_dir, annotator.lower()))\n", "\n", "print(\"set up annotation file for: {}\".format(annotator))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Annotate seed words\n", "Please open the created annotation files (.csv files) with a spreadsheet program of your choice (e.g., Excel or LibreOffice Calc) and annotate the seed words.\n", "Make sure you use either of the following sentiment classes:\n", "\n", "* positive\n", "* negative\n", "* neutral\n", "\n", "Example:\n", "\n", "| word | sentiment |\n", "| --- | --- |\n", "| good | positive |\n", "| bad | negative |\n", "| house | neutral |\n", "\n", "Once you are finished, make sure to save the file using the **.csv** extension.\n", "\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.5" } }, "nbformat": 4, "nbformat_minor": 2 }