This was a very interesting experience, and a lot easier than I expected! I wish I had known about that free virtual library before I bought the online copy of The Great Gatsby! This is one of my favorite books of all time, and I really enjoyed being able to analyze all of its data here.
Voyant Tools
Voyant was really easy to use. All you do is paste a .txt file into the processor, and it spits out all sorts of tables, word frequency lists, and data visualization maps.
This is really interesting. Not only was this program fast, but it was strikingly accurate, providing the lines in the text where every word came from and an adjustable scale for the number of words you want in your map. The only downside is that this program did not strip out the boilerplate text added by the Project Gutenberg library, which explains the copyright and the author/release date, so those words leaked into the results.
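One way to avoid that downside is to trim the Project Gutenberg header and footer before uploading the .txt file to Voyant. Gutenberg plain-text files typically bracket the actual book with "*** START OF..." and "*** END OF..." marker lines, so a short script can cut everything outside them. This is just a sketch (the sample text here is made up for illustration):

```python
# Strip Project Gutenberg boilerplate before uploading a .txt to Voyant.
# Gutenberg files usually wrap the book body in marker lines like:
#   *** START OF THE PROJECT GUTENBERG EBOOK THE GREAT GATSBY ***
#   *** END OF THE PROJECT GUTENBERG EBOOK THE GREAT GATSBY ***

def strip_gutenberg(text: str) -> str:
    lines = text.splitlines()
    start, end = 0, len(lines)
    for i, line in enumerate(lines):
        if "*** START OF" in line:
            start = i + 1   # book body begins after the START marker
        elif "*** END OF" in line:
            end = i         # book body ends before the END marker
            break
    return "\n".join(lines[start:end]).strip()

raw = ("Some header text\n"
       "*** START OF THE PROJECT GUTENBERG EBOOK ***\n"
       "In my younger and more vulnerable years...\n"
       "*** END OF THE PROJECT GUTENBERG EBOOK ***\n"
       "License text")
print(strip_gutenberg(raw))  # prints only the book body line
```

Saving the cleaned text and feeding that to Voyant keeps words like "copyright" and "Gutenberg" out of the word cloud.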
Google Gemini
Here is an image that I generated with Google Gemini (I didn’t use the HTML block because it kept glitching my entire page, so I pasted the code into TextEdit and extracted it there).


Below, I have attached a few pieces of the chat that I used to generate the images above.



Gemini really likes Python; it kept defaulting to giving me Python code even after I EXPLICITLY told it to give me HTML. Honestly, the interface was really frustrating, and no matter how hard I tried, I wasn’t able to produce an HTML file for the word connection map, and when I asked it to generate an image, as you can see, it went terribly. Also, when asked to give a word map containing 30 words, it gave me 23.
The code itself is pasted below
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <meta name="viewport" content="width=device-width, initial-scale=1.0">
  <title>The Great Gatsby - Word Cloud</title>
  <script src="https://cdnjs.cloudflare.com/ajax/libs/wordcloud2.js/1.2.2/wordcloud2.min.js"></script>
  <style>
    body {
      font-family: 'Courier New', Courier, monospace;
      background-color: #f4f4f4;
      display: flex;
      flex-direction: column;
      align-items: center;
      justify-content: center;
      height: 100vh;
      margin: 0;
    }
    h1 {
      color: #2c3e50;
      margin-bottom: 10px;
    }
    #canvas_container {
      width: 800px;
      height: 600px;
      background-color: #fff;
      box-shadow: 0 4px 8px rgba(0,0,0,0.1);
      border-radius: 8px;
      position: relative;
    }
    canvas {
      display: block;
      width: 100%;
      height: 100%;
    }
    .footer {
      margin-top: 20px;
      color: #666;
      font-size: 0.8rem;
    }
  </style>
</head>
<body>
  <h1>The Great Gatsby: Top 30 Words</h1>
  <div id="canvas_container">
    <canvas id="word_cloud_canvas"></canvas>
  </div>
  <div class="footer">Distant Reading Analysis</div>
  <script>
    // Top 30 words from The Great Gatsby analysis
    const wordList = [
      ['gatsby', 268],
      ['said', 235],
      ['tom', 193],
      ['daisy', 185],
      ['one', 153],
      ['man', 146],
      ['like', 124],
      ['didn\'t', 118],
      ['came', 108],
      ['back', 107],
      ['know', 104],
      ['little', 103],
      ['looked', 101],
      ['never', 98],
      ['went', 98],
      ['get', 94],
      ['eyes', 93],
      ['wilson', 86],
      ['house', 84],
      ['think', 83],
      ['time', 83],
      ['go', 81],
      ['night', 81],
      ['away', 80],
      ['car', 75],
      ['day', 74],
      ['room', 72],
      ['door', 69],
      ['saw', 68],
      ['people', 68]
    ];
    const canvas = document.getElementById('word_cloud_canvas');
    // Set canvas size to match container
    canvas.width = 800;
    canvas.height = 600;
    const options = {
      list: wordList,
      gridSize: 10,
      weightFactor: 0.8, // Adjusts the size of the words relative to the canvas
      fontFamily: 'Georgia, serif',
      color: function (word, weight) {
        // Return a color from a "Jazz Age" inspired palette
        const colors = ['#d4af37', '#2c3e50', '#c0392b', '#16a085', '#7f8c8d'];
        return colors[Math.floor(Math.random() * colors.length)];
      },
      rotateRatio: 0.3, // Probability of word rotation
      backgroundColor: '#ffffff'
    };
    WordCloud(canvas, options);
  </script>
</body>
</html>
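For comparison, the frequency list that Gemini hard-coded above is the kind of thing a few lines of Python can produce directly from the text. This is only a sketch: the stop-word list below is a tiny illustrative sample (real tools like Voyant use much larger ones), and the sample sentence is made up:

```python
import re
from collections import Counter

# Tiny illustrative stop-word list; real tools filter far more words.
STOPWORDS = {"the", "and", "a", "of", "to", "in", "i", "he", "she",
             "it", "was", "that", "his", "her", "with", "at", "had"}

def top_words(text: str, n: int = 30) -> list[tuple[str, int]]:
    # Lowercase, keep letters and apostrophes (so "didn't" survives intact)
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS)
    return counts.most_common(n)

sample = "Gatsby believed in the green light. Gatsby smiled, and the smile..."
print(top_words(sample, 3))
```

A nice side effect of generating the list yourself is that you can check the count: `len(top_words(text, 30))` makes it obvious whether you actually got 30 words back, which is exactly where Gemini fell short for me.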
Discussion
In a scholarly setting, we strive to analyze texts as best we can to uncover hidden meaning and trends. AI and other text-analysis programs can effectively process documents that are long and would normally be impossible to break apart, which is a huge development for scholarly work. We can make comparisons across novels, themes, and genres that would normally take long periods of time to uncover. However, data visualization is a tool, not an answer. AI in particular lacks much of the ability to detect nuance and authorial intention, leading to a lot of faulty analysis. Tools like Voyant Tools help by only giving data and visualization, without trying to make inferences about the text. While AI (though worse than Voyant Tools in this regard) can be very useful for analyzing trends, it should not be used to make inferences and assumptions about the writer’s intention.
In the “Age of AI,” we need to be cautious of the fact that AI WILL cut corners wherever it can. It finds a very linear path from point A to point B and won’t care about user-friendliness unless specifically asked for it. While AI has come a long way since 2022, it still clearly lacks the ability to analyze text beyond the scope of what it was built for. The terrifying part is that some people already trust AI to take over all of these human responsibilities. While it is a great tool, as of right now we still need to be mindful and check that what we put out into the world is verified by humans and not just by a machine riddled with errors. After this experience, I believe these tools and AI are best used to surface the data itself, not to analyze deeper themes or produce elaborate graphics to prove a point.


I was just commenting on Jackson Reade’s post about how it would be interesting to see a Gemini-generated word cloud, and you did just that, Zander! It’s interesting that Gemini likes Python so much, and also how frustrating it was for you to work with. AI is always seen as a convenient tool to make things easier, but Voyant seems to be the much better option in this case.