Our Q&A Bot & WordCloud in Python

Amine.Chakhchoukh
Published on May 20th 2020
Candide is a community of plant lovers. Often our users will come to our feed and ask questions about their plants.
Last month, we launched our Q&A Bot: a system that tries to answer any question that the user asks, using the wisdom from our articles and wider content in the app.
I ran an analysis of the content of the questions we have received and created a cloud of words for visualisation, using Python’s WordCloud package.
So, armed with around 2,500 questions, submitted to us via the Q&A Bot by just over 1,000 Candide users, let us dive in!
First, we import the libraries we will be using: numpy, pandas, wordcloud and matplotlib.pyplot.
import numpy as np
import pandas as pd
from wordcloud import WordCloud, STOPWORDS

import matplotlib.pyplot as plt
Then, we load our data into a pandas dataframe.
# Load the questions into a dataframe
df = pd.read_csv("questions.csv", index_col=0)
Next, we combine all of the submitted questions into one text, and make sure that it is all in lower case, like so:
# Combine all questions into one text
text = " ".join(question for question in df.text)
# Make sure all words are lower case
text = text.lower()
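One thing to watch for: if any row in the `text` column is empty, pandas reads it as NaN (a float), and `" ".join(...)` then raises a TypeError. Here is a minimal sketch of a more defensive version, assuming the same `text` column as above. The in-memory CSV and the helper name `combine_questions` are made up for illustration and are not part of the original code.

```python
import io

import pandas as pd

# Hypothetical stand-in for questions.csv: an index column plus a "text" column,
# with one empty question to show the failure mode.
sample_csv = io.StringIO(
    "id,text\n"
    "0,Why are my Tomato leaves turning yellow?\n"
    "1,\n"
    "2,When should I prune my roses?\n"
)
df = pd.read_csv(sample_csv, index_col=0)


def combine_questions(frame: pd.DataFrame, column: str = "text") -> str:
    """Join all non-missing questions into one lower-case string."""
    questions = frame[column].dropna().astype(str)
    return " ".join(questions).lower()


text = combine_questions(df)
print(text)
```

Dropping the missing rows before joining keeps the rest of the pipeline unchanged, since the result is still a single lower-case string.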
We are now ready to use wordcloud and matplotlib to generate and plot our cloud of words, as follows:
# Generate a wordcloud
stopwords = STOPWORDS
wordcloud = WordCloud(background_color="white", max_words=1000, stopwords=stopwords)
wordcloud.generate(text)

# Plot
plt.figure(figsize=[20, 10])
plt.imshow(wordcloud, interpolation='bilinear')
plt.axis("off")
plt.show()
Et voilà! The result looks like this:
Word Cloud, using questions submitted to our Q&A Bot
Top words are plant, flower and leaves 🌱. No surprises there 🙂!
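To check the top words numerically rather than by eye, one option is to count them directly. This is a rough sketch using only the standard library. The sample text and the small stop-word set are invented for illustration, and the tokenisation is simpler than what WordCloud does internally (by default it also counts two-word phrases), so the numbers would not match the cloud exactly.

```python
import re
from collections import Counter

# Invented sample; in the article this would be the combined `text` string.
sample_text = (
    "my plant has yellow leaves. is my plant a flower? "
    "the leaves of the plant droop."
)

# Small illustrative stop-word set (the article uses wordcloud's STOPWORDS instead).
stop_words = {"my", "has", "is", "a", "the", "of"}


def top_words(text: str, stop_words: set, n: int = 3) -> list:
    """Return the n most frequent words that are not stop words."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(word for word in words if word not in stop_words)
    return counts.most_common(n)


print(top_words(sample_text, stop_words, n=2))
```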
Now, if we would like to personalise the output, we may want this cloud of words to take a certain shape. For this, we will use the following silhouette of a robot, kindly provided by my designer colleague, Kat (it is a Q&A Bot, after all!).
[Image: silhouette of a robot]
Using the above image, and the following code:
# Personalised cloud of words
from PIL import Image  # Pillow, needed to open the silhouette image

path = "/Users/amine/code/HackDayMay2020/robot-plant.png"
# Convert the image to a numpy array so WordCloud can use it as a mask
mask = np.array(Image.open(path))
wordcloud = WordCloud(background_color="white", max_words=1000, mask=mask, stopwords=stopwords)
wordcloud.generate(text)

# Plot
plt.figure(figsize=[20, 10])
plt.imshow(wordcloud, interpolation='bilinear')
plt.axis("off")
plt.show()
we get a final result of a cloud of words in the shape of our robot!
[Image: word cloud in the shape of the robot]
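The mask works on pixel values: according to the WordCloud documentation, pure white pixels (255) are treated as masked out, and words are drawn everywhere else. The sketch below builds a simple circular mask with numpy to show the idea without needing an image file. The function name and the sizes are arbitrary choices for illustration, not part of the original code.

```python
import numpy as np


def circular_mask(size: int = 300, radius: int = 130) -> np.ndarray:
    """Return a size x size array: 0 inside the circle (drawable), 255 outside (masked out)."""
    centre = size // 2
    rows, cols = np.ogrid[:size, :size]
    # True for every pixel that lies outside the circle
    outside = (rows - centre) ** 2 + (cols - centre) ** 2 > radius ** 2
    return (255 * outside).astype(np.uint8)


mask = circular_mask()
print(mask.shape, mask[150, 150], mask[0, 0])
```

Passing this array as `mask=mask` to `WordCloud(...)`, in place of the robot image, should produce a round cloud.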
Finally, we shouldn’t forget to save our image to file, with this simple line of code:
# Save to file
wordcloud.to_file("robot-word-cloud.png")
In the next tutorial, I will be analysing how our users’ questions change with time and seasonality.
