Machine Learning

Data Science 101 (Getting started in NLP): Tokenization tutorial


This topic contains 0 replies, has 1 voice, and was last updated by kaggle 1 year, 8 months ago.

  • Data Science 101 (Getting started in NLP): Tokenization tutorial

    One common task in NLP (Natural Language Processing) is tokenization. “Tokens” are usually individual words (at least in languages like English), and “tokenization” means taking a text or set of texts and breaking it up into its individual words. These tokens are then used as the input for other types of analysis or tasks, like parsing (automatically tagging the syntactic relationships between words). In this tutorial you’ll learn how to: read text into R; select only certain lines; tokenize text; …
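The three steps named above (read text into R, select only certain lines, tokenize) can be sketched in base R as follows. This is a minimal illustration rather than the tutorial's own code: the sample sentences, the `tokenize` helper, and the choice to lowercase and strip punctuation are all assumptions made for the example, and a `textConnection` stands in for a real file so the snippet runs on its own.

```r
# Hypothetical sample text; with a real file you would call
# readLines("path/to/your_file.txt") instead of using a textConnection.
sample_text <- c(
  "Tokenization splits text into words.",
  "",
  "These tokens feed later analysis, like parsing."
)

# Step 1: read text into R, one element per line
con <- textConnection(sample_text)
text_lines <- readLines(con)
close(con)

# Step 2: select only certain lines (here: drop the empty ones)
selected <- text_lines[nzchar(text_lines)]

# Step 3: tokenize. Assumed, simple scheme: lowercase, remove
# punctuation, then split on runs of whitespace.
tokenize <- function(text) {
  cleaned <- tolower(gsub("[[:punct:]]", "", text))
  tokens <- unlist(strsplit(cleaned, "[[:space:]]+"))
  tokens[nzchar(tokens)]  # discard any empty strings left by the split
}

tokens <- tokenize(selected)
print(tokens)
```

Splitting on whitespace is only a rough approximation of word boundaries; contractions, hyphenated words, and languages that do not separate words with spaces need a more careful tokenizer.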

    Data Science 101 (Getting started in NLP): Tokenization tutorial
    by Rachael Tatman

