Experiments

Python Notebooks

The Jupyter notebooks are described as follows:

AES_MLP_BERT (9/10/2022)

We compare results from 1D-CNN, 2D-CNN, and MLP training models built with the TF Hub BERT APIs.
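As a rough illustration of the setup (not the notebook's exact code), the sketch below wires a TF Hub BERT encoder to a small MLP regression head. The TF Hub handles are the standard public BERT models; the head sizes and hyperparameters are assumptions.

```python
# Minimal sketch: TF Hub BERT encoder + MLP regression head.
# The 1D-CNN/2D-CNN variants would swap this head for Conv1D/Conv2D
# layers over the sequence output instead of the pooled output.
import tensorflow as tf
import tensorflow_hub as hub
import tensorflow_text  # noqa: F401 -- registers ops used by the BERT preprocessor

PREPROCESS_URL = "https://tfhub.dev/tensorflow/bert_en_uncased_preprocess/3"
ENCODER_URL = "https://tfhub.dev/tensorflow/bert_en_uncased_L-12_H-768_A-12/4"

essays = tf.keras.Input(shape=(), dtype=tf.string, name="essay")
encoder_inputs = hub.KerasLayer(PREPROCESS_URL)(essays)
outputs = hub.KerasLayer(ENCODER_URL, trainable=False)(encoder_inputs)

# MLP head on the pooled [CLS] representation; layer sizes are illustrative.
x = tf.keras.layers.Dense(256, activation="relu")(outputs["pooled_output"])
x = tf.keras.layers.Dropout(0.2)(x)
score = tf.keras.layers.Dense(1, name="score")(x)

model = tf.keras.Model(essays, score)
model.compile(optimizer="adam", loss="mse")
```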

Baseline-w-CSE (7/17/2022)

Baseline training models for the different first-stage experiments (Semantic/Coherence/Prompt-relevance), with spelling errors in the essays corrected to improve BERT and training-model performance. The experiments are as follows (illustrative sketches of the embedding variants, the NSP scorer, and the cosine-similarity check appear after the list):

  • Semantic Model
    • with 2nd Last Hidden State (2LHS) BERT Embeddings
    • with sum of Last 4 Hidden States (L4HS) BERT Embeddings
    • L4HS + Sentence Average BERT Embeddings
    • L4HS + Pooled BERT Embeddings
  • Coherence Model
    • with 2nd Last Hidden State (2LHS) BERT Embeddings
    • with sum of Last 4 Hidden States (L4HS) BERT Embeddings
    • L4HS + Sentence Average BERT Embeddings
    • L4HS + Pooled BERT Embeddings
  • Coherence Model - Next Sentence Prediction (NSP): Using the output of a BERT model fine-tuned for next-sentence prediction to evaluate local and global average sentence coherence (see the sketch after this list). We also perform a direct evaluation of the two models on an unseen IELTS dataset.

  • Prompt Relevance Model
    • with 2nd Last Hidden State (2LHS) BERT Embeddings
    • with sum of Last 4 Hidden States (L4HS) BERT Embeddings
    • L4HS + Sentence Average BERT Embeddings
    • L4HS + Pooled BERT Embeddings
  • Prompt Relevance Model - Cosine Similarity (COSIM): We explore evaluating prompt relevance via the cosine similarity between the pooled essay embedding and the pooled prompt embedding (see the sketch after this list). We further perform a direct evaluation of the two models on an unseen IELTS dataset.
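The embedding variants above can be read as different poolings of BERT's hidden states. A minimal sketch using the Hugging Face transformers API (the notebooks may load BERT differently, and the exact pooling details, such as token-averaging the 2nd-to-last layer, are assumptions):

```python
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased", output_hidden_states=True)
model.eval()

def bert_embeddings(text):
    enc = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        out = model(**enc)
    hidden = out.hidden_states                    # embedding layer + 12 encoder layers
    two_lhs = hidden[-2].mean(dim=1)              # 2LHS: 2nd-to-last layer, token-averaged
    l4hs = torch.stack(hidden[-4:]).sum(0).mean(dim=1)  # L4HS: sum of last 4 layers
    pooled = out.pooler_output                    # pooled [CLS] embedding
    # The sentence-average variant encodes each sentence separately with the
    # same model and averages the resulting vectors.
    return two_lhs, l4hs, pooled
```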
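For the NSP-based coherence score, one plausible reading is to score each adjacent sentence pair with BERT's next-sentence head and average the probabilities for local coherence; global coherence would pair each sentence against the rest of the essay instead. A minimal sketch, assuming the pretrained NSP head rather than the notebook's fine-tuned one:

```python
import torch
from transformers import BertTokenizer, BertForNextSentencePrediction

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
nsp = BertForNextSentencePrediction.from_pretrained("bert-base-uncased")
nsp.eval()

def local_coherence(sentences):
    """Average P(next) over adjacent sentence pairs."""
    probs = []
    for a, b in zip(sentences, sentences[1:]):
        enc = tokenizer(a, b, return_tensors="pt", truncation=True)
        with torch.no_grad():
            logits = nsp(**enc).logits
        probs.append(torch.softmax(logits, dim=-1)[0, 0].item())  # class 0 = "is next"
    return sum(probs) / len(probs)
```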
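The COSIM check reduces to a cosine similarity between two pooled embeddings. A minimal sketch, with illustrative function names of our own choosing:

```python
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")
model.eval()

def pooled_embedding(text):
    enc = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        return model(**enc).pooler_output.squeeze(0)

def prompt_relevance(essay, prompt):
    e, p = pooled_embedding(essay), pooled_embedding(prompt)
    return torch.nn.functional.cosine_similarity(e, p, dim=0).item()
```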

AES-baseline (7/17/2022)

Baseline training models for the different first-stage experiments (Semantic/Coherence/Prompt-relevance). The experiments are as follows:

  • Semantic Model
    • with 2nd Last Hidden State (2LHS) BERT Embeddings
    • with sum of Last 4 Hidden States (L4HS) BERT Embeddings
    • L4HS + Sentence Average BERT Embeddings
    • L4HS + Pooled BERT Embeddings
  • Coherence Model
    • with 2nd Last Hidden State (2LHS) BERT Embeddings
    • with sum of Last 4 Hidden States (L4HS) BERT Embeddings
    • L4HS + Sentence Average BERT Embeddings
    • L4HS + Pooled BERT Embeddings
  • Prompt Relevance Model
    • with 2nd Last Hidden State (2LHS) BERT Embeddings
    • with sum of Last 4 Hidden States (L4HS) BERT Embeddings
    • L4HS + Sentence Average BERT Embeddings
    • L4HS + Pooled BERT Embeddings

2nd Stage Model Selection: We explore different tree-based models (including XGBoost and Random Forest) for the 2nd-stage regression model; a sketch of the comparison follows.
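A minimal sketch of such a comparison, assuming the first-stage outputs are stacked into a feature matrix (the feature layout, hyperparameters, and scoring metric are assumptions, and random placeholders stand in for real data):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score
from xgboost import XGBRegressor

# X: one row per essay, columns = first-stage outputs (semantic, coherence,
# prompt-relevance); y: holistic essay score. Placeholders for illustration.
rng = np.random.default_rng(0)
X = rng.random((500, 3))
y = rng.random(500) * 12

for name, reg in [("RF", RandomForestRegressor(n_estimators=200)),
                  ("XGBoost", XGBRegressor(n_estimators=200, max_depth=4))]:
    scores = cross_val_score(reg, X, y, cv=5, scoring="neg_mean_squared_error")
    print(name, -scores.mean())
```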

GitHub Repository

https://github.com/agastyaseth/aes-two-stage

Preprint Article

https://github.com/agastyaseth/aes-two-stage/blob/main/AES-Paper-Feb2022-v1.0.pdf