: Generally recommended unless you are performing Named Entity Recognition (NER).
: Removal of HTML tags, metadata, and special characters.
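
The removal of HTML tags and special characters described above could be sketched as follows. This is a minimal illustration using only the Python standard library, not the pipeline actually used to produce the dataset. The function name `strip_markup` and the set of characters treated as "special" are assumptions made for the example, and metadata removal is not covered because it depends on the source format.

```python
import html
import re

# Matches anything that looks like an HTML tag, e.g. <p> or </div>.
TAG_RE = re.compile(r"<[^>]+>")
# Assumption: letters, digits, whitespace and basic punctuation are kept;
# every other character counts as "special" and is replaced by a space.
SPECIAL_RE = re.compile(r"[^A-Za-z0-9\s.,!?'-]")
WHITESPACE_RE = re.compile(r"\s+")


def strip_markup(text: str) -> str:
    """Hypothetical helper: drop HTML tags and special characters."""
    text = html.unescape(text)        # decode entities such as &amp;
    text = TAG_RE.sub(" ", text)      # remove tags
    text = SPECIAL_RE.sub(" ", text)  # remove special characters
    return WHITESPACE_RE.sub(" ", text).strip()  # collapse whitespace


print(strip_markup("<p>Sydney &amp; Melbourne: 25°C today!</p>"))
```

A regular expression is enough for a sketch like this, but heavily nested or malformed HTML is usually better handled by a real parser.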
The "AU" designation signifies [1]. The "Clean" suffix indicates that the raw data (often scraped from Australian news sites, social media, or government records) has undergone several cleaning steps: