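Today's post is based on an EMNLP 2021 Findings paper titled "AEDA: An Easier Data Augmentation Technique for Text Classification". The whole paper can actually be summarized in one sentence: for text classification tasks, inserting a few punctuation marks into a sentence is the strongest data augmentation method.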
$$
\begin{array}{cc}
\hline
\textbf{Original} & \text{a sad , superior human comedy played out on the back roads of life .} \\
\hline
\textbf{Aug 1} & \text{a sad , superior human comedy played out on the back roads ; of life ; .} \\
\hline
\textbf{Aug 2} & \text{a , sad . , superior human ; comedy . played . out on the back roads of life .} \\
\hline
\textbf{Aug 3} & \text{: a sad ; , superior ! human : comedy , played out ? on the back roads of life .} \\
\hline
\end{array}
$$
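To make the idea in the table concrete, here is a minimal Python sketch of punctuation insertion. It is not the authors' code. The function name `aeda`, the `max_ratio` and `rng` parameters, the six-mark set, and the rule of inserting between 1 and roughly one third of the word count are all assumptions based on my reading of the paper, so adjust them as needed.

```python
import random

# Assumed set of marks to insert (not stated in this post).
PUNCTUATIONS = [".", ",", "!", "?", ";", ":"]


def aeda(sentence, max_ratio=1 / 3, rng=None):
    """Return a copy of `sentence` with random punctuation marks inserted.

    Words are split on whitespace. Between 1 and max_ratio * len(words)
    marks are inserted, each one placed directly before a randomly
    chosen word. The original words and their order are left unchanged.
    """
    rng = rng or random.Random()
    words = sentence.split()
    if not words:
        return sentence

    # Upper bound on insertions: at least 1, at most the number of words.
    upper = min(len(words), max(1, int(max_ratio * len(words))))
    n_insert = rng.randint(1, upper)

    # Distinct word indices that will get a mark placed before them.
    positions = set(rng.sample(range(len(words)), n_insert))

    augmented = []
    for i, word in enumerate(words):
        if i in positions:
            augmented.append(rng.choice(PUNCTUATIONS))
        augmented.append(word)
    return " ".join(augmented)


if __name__ == "__main__":
    text = "a sad , superior human comedy played out on the back roads of life ."
    for _ in range(3):
        print(aeda(text))
```

Each call yields a different augmented sentence, so calling it several times per training example gives several variants, much like the three "Aug" rows above. Passing a seeded `random.Random` as `rng` makes the output reproducible.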