Commit 44f597f1 authored by D.H.D. Nguyen

current version

parent b69718ba

README.md

# Web Interface for Best-Worst-Scaling

(authored by Dung Nguyen, Maryna Charniuk, Sanaz Safdel, 2019-2020)

This project aims to create a user-friendly website for annotating data 
using **Best-Worst-Scaling** ([Kiritchenko and Mohammad 2016](https://saifmohammad.com/WebPages/BestWorst.html)).

## Requirements

* [Python 3.6](https://www.python.org/downloads/release/python-369/) or later
* [Flask](https://flask.palletsprojects.com/)
* [Flask-Bootstrap](https://pythonhosted.org/Flask-Bootstrap/)
* [Flask-Login](https://flask-login.readthedocs.io/en/latest/)
* [Flask-WTF](https://flask-wtf.readthedocs.io/en/stable/)
* [Flask-SQLAlchemy](https://flask-sqlalchemy.palletsprojects.com/en/2.x/)
* [Boto3](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/mturk.html)
* [Pytest](https://docs.pytest.org/en/latest/)
* [Sphinx](https://www.sphinx-doc.org/en/master/)

## Installation
* Create a virtual environment using [venv](https://docs.python.org/3/library/venv.html)
 or [virtualenv](https://virtualenv.pypa.io/en/latest/) to manage dependencies for this repository. (*recommended*)
* Clone the repository:
```sh
$ git clone https://gitlab.cl.uni-heidelberg.de/nguyen/swp.git
$ cd swp/
```

* After activating the virtual environment, install the project's requirements:
```sh
$ pip install -r requirements.txt
```

## How to

### 1. Web Application
In `swp/`, run:
```sh
$ python main.py
```
The application runs locally. Open this [URL](http://127.0.0.1:5000/ "Local development system") in any browser to access the web application.

#### § Structure
    
The following scheme shows the directory structure of the web application:
```bash   
swp/
├── README.md
├── __init__.py
├── config.py
├── doc
│   └── ...
├── examples
│   ├── first_10_characters_examples.txt
│   └── movie_reviews_examples.txt
├── index.rst
├── main.py
├── project
│   ├── __init__.py
│   ├── annotator
│   │   ├── __init__.py
│   │   ├── account.py
│   │   ├── annotation.py
│   │   ├── forms.py
│   │   ├── helpers.py
│   │   └── views.py
│   ├── generator.py
│   ├── models.py
│   ├── start
│   │   ├── __init__.py
│   │   └── routes.py
│   ├── static
│   │   └── styles.css
│   ├── templates
│   │   ├── annotator
│   │   │   ├── batch.html
│   │   │   ├── index.html
│   │   │   └── project.html
│   │   ├── questions.xml
│   │   ├── start.html
│   │   └── user
│   │       ├── index.html
│   │       ├── login.html
│   │       ├── profile.html
│   │       ├── project.html
│   │       ├── signup.html
│   │       └── upload-project.html
│   ├── user
│   │   ├── __init__.py
│   │   ├── account.py
│   │   ├── forms.py
│   │   ├── helpers.py
│   │   ├── home.py
│   │   ├── inputs.py
│   │   ├── outputs.py
│   │   └── views.py
│   └── validators.py
├── requirements.txt
└── tests
   └── ...
```

#### § Short User Manual
* In order to upload a project, you need an account first. Then, follow the instructions on the website.
* For the project, upload only non-empty **_.txt_**-files. 
* There are two options for how the annotation works:
	* Option 1: Local annotator system - you find the annotators yourself.
	* Option 2: ***Mechanical Turk*** - the project is created as HITs on Amazon's 
crowdsourcing platform, [Mechanical Turk](https://www.mturk.com/). Workers who are 
interested in the HITs accept and complete the annotations, so you do not need to 
find any annotators yourself.
* At any time (once at least one annotator has submitted a batch), two files can be downloaded:
	* *scores.txt*: the calculated scores of the items
	* *report.txt*: a report with the raw annotation data
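
How the scores in *scores.txt* are computed is not spelled out here; a common approach for Best-Worst-Scaling is simple counting (Orme 2009): an item's score is (#times chosen best − #times chosen worst) / #appearances. The sketch below illustrates that counting method only; the annotation format and function name are assumptions, not this project's actual API:

```python
from collections import Counter

def bws_scores(annotations):
    """Counting-based BWS scores.

    ``annotations`` is a list of (tuple_items, best, worst) entries,
    where ``tuple_items`` is the list of items shown to the annotator.
    Score = (#chosen best - #chosen worst) / #appearances.
    """
    best, worst, seen = Counter(), Counter(), Counter()
    for items, b, w in annotations:
        seen.update(items)   # count every appearance of every item
        best[b] += 1
        worst[w] += 1
    return {item: (best[item] - worst[item]) / seen[item] for item in seen}

ann = [
    (["a", "b", "c", "d"], "a", "d"),
    (["a", "b", "c", "d"], "a", "c"),
]
print(bws_scores(ann))  # {'a': 1.0, 'b': 0.0, 'c': -0.5, 'd': -0.5}
```

Scores fall in [-1, 1]: an item always picked as best scores 1, one always picked as worst scores -1.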
    
### 2. Testing
To run the tests:
```sh
$ pytest
```

#### § Structure
```sh
swp/
├── ...
│   └── ...
└── tests
    ├── __init__.py
    ├── conftest.py
    ├── functional
    │   ├── __init__.py
    │   ├── test_annotators.py
    │   ├── test_batches.py
    │   ├── test_projects.py
    │   ├── test_users.py
    │   └── test_wrong_cases_input_required.py
    └── unit
        ├── __init__.py
        ├── test_generator.py
        └── test_models.py
```

#### § Tests
##### 1. Unit Tests
* Test creating and saving data in any table; test relationships between tables
* Test adding uploaded items, creating tuples, creating batches
	+ Every uploaded item must be included.
	+ Every item must appear in at least one tuple.
	+ Items must appear in roughly the same number of tuples: 2 conditions
		1. Most items have a frequency in the range (`average frequency - 2`, 
		`average frequency + 3`). This is because tuple creation in the source 
		code is based on randomization and shuffling.
		2. `Max frequency` and `min frequency` are within `± 5` of the `average frequency`.
	+ Batches must be divided roughly equally: 2 cases
		+ *Case 1*: for all batches: `normal batch size` ≤ `batch size` ≤ `normal batch size + (minimum batch size - 1)`.
		**E.g.**: `normal batch size` = 20, `minimum batch size` = 5 => 20 ≤ `batch size` ≤ 24.
		+ *Case 2*: exactly one batch with `minimum batch size` ≤ `batch size` < `normal batch size` 
		is accepted, and all remaining batches have the size `normal batch size`. 
		**E.g.**: `normal batch size` = 20, `minimum batch size` = 5 => one batch with 
		5 ≤ `batch size` < 20, all others of size 20.
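The actual tuple generation lives in `project/generator.py`; as an illustrative sketch only (the function name and parameters are assumptions, not the project's API), the shuffle-and-chunk approach the tests describe can look like this:

```python
import random

def make_tuples(items, tuple_size=4, rounds=2, seed=0):
    """Shuffle the items each round and chunk them into tuples.

    Over several rounds, every item appears in at least one tuple and
    per-item frequencies stay close to the average, as the unit tests
    above require.
    """
    rng = random.Random(seed)
    tuples = []
    for _ in range(rounds):
        pool = items[:]
        rng.shuffle(pool)          # randomization and shuffling
        for i in range(0, len(pool) - tuple_size + 1, tuple_size):
            tuples.append(tuple(pool[i:i + tuple_size]))
    return tuples

items = [f"item{i}" for i in range(12)]
ts = make_tuples(items)            # 6 tuples of 4 items each
freq = {it: sum(it in t for t in ts) for it in items}
```

When the item count is divisible by the tuple size, every item appears exactly `rounds` times; otherwise leftover items cause the small frequency spread the tests tolerate.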
        
##### 2. Functional Tests
* Test validations in user registration
    * Username and email have not been used before.
    * Username contains no special characters and meets the length requirement.
    * Email must be in a valid email format.
    * Password must meet the length requirement.
* Test validations in user login
    * A username that has not been signed up returns an error.
    * An invalid password for a valid username is not accepted.
* Test validations in uploading a project
    * There must be at least one non-empty `txt` file.
    * The project must contain at least 5 uploaded items.
    * The project description must be long enough (at least 20 characters).
    * The **Best** and **Worst** definitions must not be the same.
* Test validation in annotator login
    * If a keyword has already been used, the pseudonym must match the previously 
given pseudonym. (No two annotators may have the same keyword.)
* Test validations in annotating a batch
    * Every field is required.
    * In a tuple, an item is not allowed to be chosen as both **Best** and **Worst**.
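
A rule like the best/worst exclusivity check above can be sketched as a plain validator plus a pytest case (the names here are illustrative, not the project's actual functions):

```python
def validate_annotation(tuple_items, best, worst):
    """Reject annotations where one item is chosen as both best and worst."""
    if best not in tuple_items or worst not in tuple_items:
        raise ValueError("choices must come from the shown tuple")
    if best == worst:
        raise ValueError("an item cannot be both best and worst")

def test_best_equals_worst_rejected():
    import pytest
    # choosing the same item as best and worst must fail
    with pytest.raises(ValueError):
        validate_annotation(["a", "b", "c", "d"], "a", "a")
```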

> **Note**: This project has no validation of required inputs for form attributes 
defined as `MultipleFileField`, `StringField`, `PasswordField` or `TextAreaField` 
from the module [wtforms.fields](https://wtforms.readthedocs.io/en/stable/fields.html).
> **Reason**: The validator `InputRequired` from the module 
[wtforms.validators](https://wtforms.readthedocs.io/en/stable/validators.html#wtforms.validators.InputRequired) 
can validate this requirement directly on the web server, but during backend testing 
those fields are misinterpreted (due to the [source code](https://github.com/wtforms/wtforms/blob/master/src/wtforms/fields/core.py)). 
For more information, see the cases in `tests/functional/test_wrong_cases_input_required.py`.

### 3. Documentation
* To build and read the documentation, run:
```sh
$ cd doc/
$ make html
```
then open `build/html/index.html` in a browser.



## Additional Resources
* Bryan K. Orme. *MaxDiff analysis: Simple counting, individual-level logit, and
HB*. 2009. [URL](https://www.sawtoothsoftware.com/download/techpap/indivmaxdiff.pdf)
* Saif Mohammad and Peter D. Turney. *Crowdsourcing a word-emotion association
lexicon*. CoRR, abs/1308.6297, 2013. [URL](http://arxiv.org/abs/1308.6297)
* Svetlana Kiritchenko and Saif M. Mohammad. *Best-worst scaling more reliable than
rating scales: A case study on sentiment intensity annotation*. CoRR,
abs/1712.01765, 2017. [URL](http://arxiv.org/abs/1712.01765)

__init__.py


Empty file added.

config.py

# -*- coding: utf-8 -*-
"""

Module ``config``
********************

This module defines different Config objects for different servers.

"""

import os
basedir = os.path.abspath(os.path.dirname(__file__))


class Config:
	"""
	Base Configurations used for all servers.
	"""
	SECRET_KEY = os.environ.get('SECRET_KEY') or 'Thisissupposedtobesecret!'
	SQLALCHEMY_TRACK_MODIFICATIONS = False
	BASE_DIR = basedir
	SQLALCHEMY_DATABASE_URI = 'sqlite:///%s'%(os.path.join(basedir, 'database.db'))

	@staticmethod
	def init_app(app):
		pass

class DevelopmentConfig(Config):
	"""
	Configurations used during development.
	"""
	FLASK_ENV = 'development'
	DEBUG = True
	SQLALCHEMY_DATABASE_URI = os.environ.get('DEV_DATABASE_URL') or \
								 'sqlite:///%s'%(os.path.join(basedir, 'database-dev.db'))
	MTURK_URL = 'https://mturk-requester-sandbox.us-east-1.amazonaws.com'  # in production mode this will be None!
	# Never hard-code AWS credentials; read them from the environment.
	AWS_ACCESS_KEY_ID = os.environ.get('AWS_ACCESS_KEY_ID')
	AWS_SECRET_ACCESS_KEY = os.environ.get('AWS_SECRET_ACCESS_KEY')
	MTURK_SHOW_UP_URL = "https://workersandbox.mturk.com/"

class TestingConfig(Config):
	"""
	Configurations used during testing.
	"""
	DEBUG = False
	TESTING = True
	SQLALCHEMY_DATABASE_URI = os.environ.get('TEST_DATABASE_URL') or \
								 'sqlite:///%s'%(os.path.join(basedir, 'database-test.db'))
	WTF_CSRF_ENABLED = False

config = {
	'development': DevelopmentConfig,
	'testing': TestingConfig,
	'default': DevelopmentConfig
}
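
The `config` dictionary is typically consumed by an application factory that picks a configuration by name. A minimal sketch of that pattern (the factory name is an assumption, not necessarily what this project's `project/__init__.py` does):

```python
from flask import Flask
from config import config

def create_app(config_name='default'):
    # Look up the Config subclass by name and apply it to the app.
    app = Flask(__name__)
    app.config.from_object(config[config_name])
    config[config_name].init_app(app)
    return app

# e.g. use the testing configuration in the pytest fixtures
app = create_app('testing')
```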

doc/Makefile

# Minimal makefile for Sphinx documentation
#

# You can set these variables from the command line.
SPHINXOPTS    =
SPHINXBUILD   = sphinx-build
SPHINXPROJ    = BWS
SOURCEDIR     = source
BUILDDIR      = build

# Put it first so that "make" without argument is like "make help".
help:
	@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)

.PHONY: help Makefile

# Catch-all target: route all unknown targets to Sphinx using the new
# "make mode" option.  $(O) is meant as a shortcut for $(SPHINXOPTS).
%: Makefile
	@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
+3.11 KiB

File added. No diff preview for this file type.