Candide is a community of plant lovers. Often our users will come to our feed and ask questions about their plants.
Last month, we launched our Q&A Bot: a system that tries to answer any question that the user asks, using the wisdom from our articles and wider content in the app.
I ran an analysis of the content of the questions we have received and created a cloud of words for visualisation, using Python’s WordCloud package.
So, armed with around 2,500 questions, submitted to us via the Q&A Bot by just over 1,000 Candide users, let us dive in!
First, we import the libraries we will be using: numpy, pandas, wordcloud and matplotlib.pyplot, plus Image from Pillow, which we need later to load the mask image.
import numpy as np
import pandas as pd
from wordcloud import WordCloud, STOPWORDS
from PIL import Image
import matplotlib.pyplot as plt
Then, we load our data into a pandas dataframe.
df = pd.read_csv("questions.csv", index_col=0)
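The file itself is not shown in this post. As a rough sketch, assuming questions.csv has an unnamed index column followed by a text column (the column name comes from the code further down; the sample rows are made up), loading it looks something like this:

```python
import io

import pandas as pd

# Hypothetical stand-in for questions.csv: an index column plus a "text" column.
sample_csv = io.StringIO(
    ",text\n"
    "0,Why are my tomato leaves turning yellow?\n"
    "1,When should I prune my roses?\n"
    "2,How often should I water my plant?\n"
)

# index_col=0 tells pandas to use the first column as the row index.
sample_df = pd.read_csv(sample_csv, index_col=0)

print(sample_df.shape)
print(sample_df.text.head())
```

A quick look at the shape and the first few rows is a cheap way to confirm the questions landed in the column we expect.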
Next, we combine all of the submitted questions into a single string and convert it to lower case, like so:
text = " ".join(question for question in df.text)
text = text.lower()
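One thing to watch for: the join fails with a TypeError if any row in the column is empty, because pandas stores missing values as non-string objects. A defensive variant, sketched here on made-up data, drops missing entries first:

```python
import pandas as pd

# Made-up questions, including one missing value, to mimic an imperfect export.
sample_df = pd.DataFrame(
    {"text": ["Why is my Plant wilting?", None, "When do roses FLOWER?"]}
)

# Drop missing entries and force everything to str before joining.
questions = sample_df.text.dropna().astype(str)
sample_text = " ".join(questions).lower()

print(sample_text)
```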
We are now ready to use wordcloud and matplotlib to generate and plot our cloud of words, as follows:
stopwords = STOPWORDS
wordcloud = WordCloud(background_color="white", max_words=1000, stopwords=stopwords)
Et voilà! The result looks like this:
Word Cloud, using questions submitted to our Q&A Bot
Top words are plant, flower and leaves 🌱. No surprises there 🙂!
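To check the top words numerically rather than by eye, a simple count does the job. This is a rough sketch on made-up text with a tiny hand-picked stop-word set; it is not how WordCloud tokenises text internally.

```python
import re
from collections import Counter

# Made-up text standing in for the combined questions.
sample_text = (
    "why is my plant wilting how often should i water my plant "
    "which plant will flower in shade why are the leaves yellow"
)

# Hypothetical, deliberately tiny stop-word list for illustration only.
stop_words = {"why", "is", "my", "how", "often", "should", "i", "which",
              "will", "in", "are", "the"}

# Split into lower-case word tokens and count everything that is not a stop word.
tokens = re.findall(r"[a-z']+", sample_text)
counts = Counter(t for t in tokens if t not in stop_words)

for word, n in counts.most_common(3):
    print(word, n)
```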
Now, if we would like to personalise the output, we may want the cloud of words to take a certain shape. For this, we will use the following silhouette of a robot, kindly provided by my designer colleague, Kat (it is a Q&A Bot after all!).
Using the above image, and the following code:
path = "/Users/amine/code/HackDayMay2020/robot-plant.png"
mask = np.array(Image.open(path))
wordcloud = WordCloud(background_color="white", max_words=1000, mask=mask, stopwords=stopwords)
we get a final result of a cloud of words in the shape of our robot!
Finally, we shouldn’t forget to save our image to file, with this simple line of code:
In the next tutorial, I will be analysing how our users’ questions change with time and seasonality.
References & resources
- WordCloud for Python documentation
- GitHub code of the WordCloud project
- Generating WordClouds in Python by Duong Vu
- Generating Word Cloud in Python by SumedhKadam