{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Naive Bayes Model Lab (Optional)\n", "\n", "> The goal of this lab is to classify the text of Yelp reviews with a naive Bayes model." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step 1: Load the data\n", "\n", "Read `yelp.csv` into a DataFrame." ] }, { "cell_type": "code", "execution_count": 66, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
| | business_id | date | review_id | stars | text | type | user_id | cool | useful | funny |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 9yKzy9PApeiPPOUJEtnvkg | 2011-01-26 | fWKvX83p0-ka4JS3dc6E5A | 5 | My wife took me here on my birthday for breakf... | review | rLtl8ZkDX5vH5nAx9C3q5Q | 2 | 5 | 0 |
| 1 | ZRJwVLyzEJq1VAihDhYiow | 2011-07-27 | IjZ33sJrzXqU-0X6U8NwyA | 5 | I have no idea why some people give bad review... | review | 0a2KyEL0d3Yb1V6aivbIuQ | 0 | 0 | 0 |
| 2 | 6oRAC4uyJCsJl1X0WZpVSA | 2012-06-14 | IESLBzqUCLdSzSqm0eCSxQ | 4 | love the gyro plate. Rice is so good and I als... | review | 0hT2KtfLiobPvh6cDC8JQg | 0 | 1 | 0 |
| 3 | _1QQZuf4zZOyFCvXc0o6Vg | 2010-05-27 | G-WvGaISbqqaMHlNnByodA | 5 | Rosie, Dakota, and I LOVE Chaparral Dog Park!!... | review | uZetl9T0NcROGOyFfughhg | 1 | 2 | 0 |
| 4 | 6ozycU1RpktNG2-1BroVtw | 2012-01-05 | 1uJFq2r5QfJG_6ExMRCaGw | 5 | General Manager Scott Petello is a good egg!!!... | review | vYmM4KTsC8ZfQBg-j5MWkw | 0 | 0 | 0 |
MultinomialNB()
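The `MultinomialNB()` output above is the fitted scikit-learn estimator. As a rough illustration of what a multinomial naive Bayes classifier computes, here is a minimal pure-Python sketch. It is not the notebook's code: the function names `tokenize`, `fit_multinomial_nb`, and `predict` are hypothetical, the four reviews are made-up toy data rather than rows from `yelp.csv`, and the whitespace tokenizer is a simplifying assumption. Each class is scored as its log prior plus the sum of Laplace-smoothed log word likelihoods.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Simplifying assumption: lowercase and split on whitespace only.
    return text.lower().split()


def fit_multinomial_nb(texts, labels, alpha=1.0):
    """Count words per class; alpha is the Laplace smoothing constant."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)  # class -> word -> count
    vocab = set()
    for text, label in zip(texts, labels):
        tokens = tokenize(text)
        word_counts[label].update(tokens)
        vocab.update(tokens)
    n_docs = len(labels)
    return {
        "alpha": alpha,
        "vocab": vocab,
        "word_counts": word_counts,
        # log P(class), estimated from class frequencies
        "log_prior": {c: math.log(n / n_docs) for c, n in class_counts.items()},
        # total number of word occurrences seen in each class
        "totals": {c: sum(word_counts[c].values()) for c in class_counts},
    }


def predict(model, text):
    """Return the class with the highest log posterior score."""
    vocab_size = len(model["vocab"])
    alpha = model["alpha"]
    best_class, best_score = None, -math.inf
    for c, log_prior in model["log_prior"].items():
        score = log_prior
        for tok in tokenize(text):
            if tok not in model["vocab"]:
                continue  # ignore words never seen during training
            count = model["word_counts"][c][tok]
            score += math.log((count + alpha) / (model["totals"][c] + alpha * vocab_size))
        if score > best_score:
            best_class, best_score = c, score
    return best_class


# Made-up toy reviews (not taken from yelp.csv), labeled with star ratings.
texts = [
    "great food and great service",
    "love this place",
    "terrible food and rude staff",
    "awful service never again",
]
labels = [5, 5, 1, 1]

model = fit_multinomial_nb(texts, labels)
print(predict(model, "great place"))
print(predict(model, "rude staff"))
```

Working in log space avoids numerical underflow when many small probabilities are multiplied, and the smoothing constant keeps a word that never appeared in one class from driving that class's probability to zero.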