Text Cleaning

Before tokenizing the posts and feeding them into text classification algorithms, we need to tidy up the original posts. Text cleaning is task-oriented; for this project, we have two main tasks: 1. prepare posts for human readability on the crowdsourced labeling webpage; 2. prepare posts for machine learning algorithms. The language we use to clean…