Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

 

Some features I like

 

  1. (in a long line of LiWC-like lexicons) Chenhao Tan's list of hedging phrases

    Expand
    titlelist of hedges (phrases and regular expressions)

     

    (some as regular expressions [README, list itself]

  2. Part-of-speech n-grams
  3. Language models on the most frequent words only
    1. Distinctiveness
  4. Language models on the content words
  5. Distributional similarity

... and one feature that I both like and drives me crazy: token length

What does this mean in the age of deep learning, where we don't need to worry about features anymore?

  1. BERT vs hand features, controversy paper
  2. Word embeddings
    1. BERT - word pieces!
  3. Language modeling

 

 

Expand

test

Expand
title(2) Click here to expand...

hidden

less hidden

...