
2025.chum-1.8

This paper evaluates the funniness of jokes generated by the AI system Witscript compared to those written by a professional human joke writer, using audience laughter as a measure. The findings indicate that the AI-generated jokes elicited laughter on par with human-crafted jokes, suggesting that AI can now produce original humor effectively. The study highlights the limitations of traditional numerical ratings for humor evaluation and proposes laughter measurement as a more reliable method.

Can AI Make Us Laugh?

Comparing Jokes Generated by Witscript and a Human Expert

Joe Toplyn¹, Ori Amir²
¹Twenty Lane Media, LLC; ²HaHator LLC & Pomona College
[email protected], [email protected]

Proceedings of the 1st Workshop on Computational Humor (CHum), pages 71–78
January 19, 2025. ©2025 Association for Computational Linguistics

Abstract

Evaluating the effectiveness of a joke-generating AI system ultimately comes down to one question: are its jokes as funny as those crafted by humans? Prior studies have typically relied on numerical ratings assigned by human evaluators (a method with inherent limitations), and few have directly compared the quality of AI-generated jokes to that of jokes created by professional human joke writers. In this study, we measured audience laughter, a direct and fundamental response to jokes, to assess the funniness of jokes produced by a specialized AI joke-writing system. We also compared those jokes to those written by a professional human joke writer to determine which elicited more laughter. Our findings reveal that the AI-generated jokes got as much laughter as the human-crafted ones. This suggests that the best AI joke generators are now capable of composing original, conversational jokes on par with those of a professional human comedy writer.

1 Introduction

Generating humor is often regarded as an AI-complete problem, one that requires full human intelligence to solve (Hurley, 2011; Winters, 2021). Generating original humor is even a challenge for humans (Amir, 2022; Tikhonov, 2024), and the brains of professional comedians are distinct functionally and structurally (Amir, 2016; Brawer, 2021). Witscript is one of the few AI systems that can generate contextually integrated jokes, like the jokes a human might improvise in a conversation (Toplyn, 2021b). The last time Witscript was systematically evaluated (Toplyn, 2023), human evaluators judged its responses to input sentences to be jokes 44% of the time. Since then, the Witscript system has been improved. This paper puts the current system to a more challenging test, comparing the funniness of its jokes to the funniness of jokes written by a professional human joke writer.

To determine whether Witscript is as funny as a human expert, a reliable method for evaluating the funniness of jokes is necessary. Goes et al. (2022) use an AI model, but papers on the computational generation of humor almost always evaluate the generated text using non-expert humans recruited on crowdsourcing platforms like Amazon Mechanical Turk (Loakman, 2023). This evaluation method is probably common because it is relatively low-cost and easy to carry out.

Nevertheless, using non-expert humans to rate jokes on a numerical scale has significant limitations (Amin, 2020; Hossain, 2020; Inácio, 2024; Valitutti, 2013). Evidence indicates that non-expert humans cannot appropriately evaluate the quality of creative text (Lamb, 2015). In the case of jokes, this apparent inability to judge quality may arise because jokes are designed to elicit laughter, not high numerical ratings. Indeed, a common definition of "joke" is "something said or done to provoke laughter" (www.merriam-webster.com/dictionary/joke). And "voiced laughter is correlated with highly amusing multimedia content" (Petridis, 2009). So, we believe that measuring the laughter elicited by a joke is a better way to measure its funniness.

Laughter is strongly influenced by social context. We laugh the most when we interact with someone in person, instead of via voice or text (Scott, 2014). Therefore, to ensure a stronger laughter signal that can be more accurately measured, a joke should be delivered to a group of people by someone in their presence. Delivering the joke to a group would also help compensate for the fact that evaluating humor is subjective: if a
joke elicits a big laugh from a group, that means many people thought it was funny and, therefore, that the joke can objectively be assigned a high funniness rating. We decided, then, that the most reliable way to measure the funniness of jokes like those generated by Witscript is to measure how much laughter they elicit when they are delivered by professional standup comics in front of live audiences.

2 Related Work

Other authors have tasked human evaluators with comparing the funniness of jokes written by humans to that of jokes generated by AI systems. But those authors used numerical scales, not measurements of laughter, to rate the funniness of the output (Gorenz, 2024; He, 2019; Mittal, 2022; Petrović, 2013; Tikhonov, 2024; Zhang, 2020). To the best of our knowledge, this paper represents the first time that jokes generated by an AI system have been formally evaluated in the context of standup comedy performances.

3 Description of the Witscript System

Witscript is a neural-symbolic hybrid AI system designed to work in American English (Toplyn, 2023). It's symbolic because it incorporates joke-writing algorithms created by a human expert (Toplyn, 2014). And it's neural because it executes those algorithms, and other joke-production methods, by calling on a large language model in the GPT family from OpenAI (Brown et al., 2020). The Witscript jokes used in this research were generated by the version of the Witscript app that was publicly available on October 9, 2024, from www.witscript.com. The algorithms are based on formulas described in Toplyn (2014) and several patents (Toplyn 2020a, 2020b, 2021a).

4 System Evaluation

4.1 Input Selection

Author OA selected 16 current news headlines for use in evaluating Witscript. Author JT, a professional comedy writer, eliminated any of those headlines that were strongly associated with events occurring after the knowledge cutoff date of the GPT model used by Witscript. That way, Witscript's performance wouldn't be adversely affected by the system's dependence on non-current training data.

From the remaining news headlines, JT selected eight that, in his expert opinion, had two characteristics that made them particularly well-suited for joke writing: (1) they were likely to capture most people's interest, as good joke topics do (Toplyn, 2014); and (2) they were relatively "evergreen" (likely to seem fresh indefinitely), so jokes based on them wouldn't get stale and unfunny before testing was completed.

Then JT edited each of the eight selected news headlines into a form that he believed, in his expert opinion, would make it a useful joke topic. Each resulting topic had the following characteristics: (1) it was one sentence; (2) it was likely to be easily understood by its intended audience of adult Americans; and (3) it was relatively simple, with only one or two attention-getting elements, which Toplyn (2014) calls "topic handles."

4.2 Joke Production

The human expert (a longtime joke writer for a well-known, U.S.-based, late-night comedy/talk show) and Witscript, operated by JT, independently generated jokes based on the eight edited topics. They were given three days to complete the task to the best of their ability, so that the speed of their joke production would not be a factor.

The human expert and JT each selected from all of their own output the one joke for each topic that they believed would elicit the most laughter from an audience of typical American adults. They submitted their eight chosen jokes to a third-party data manager without sharing them with each other. All of Witscript's selected jokes were submitted exactly as they were output by Witscript.

4.3 Laughter Measurement

Experienced standup comics performed two comedy sets in front of live audiences in comedy venues in the U.S. The comics did not reveal the sources of the jokes and did not know which jokes had been written by AI. In each set, jokes based on all of the eight topics were performed, with half of the punchlines written by the human expert and half by Witscript. Both the order of the topics and which punchline was selected for each topic were determined randomly and counterbalanced between sets. As a cover story, the comics explained that they would be performing some jokes written by a friend.
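
The counterbalancing described in Section 4.3 can be illustrated with a minimal Python sketch. It is not the authors' procedure: the function name `counterbalanced_sets`, the seed, and the choice to reuse one random topic order in both sets are assumptions made for illustration. The paper only states that topic order and punchline source were determined randomly and counterbalanced between sets.

```python
import random


def counterbalanced_sets(topic_ids, seed=None):
    """Assign a punchline source to each topic for two sets (hypothetical helper).

    Half of the topics get the AI punchline in Set 1; every topic gets the
    opposite source in Set 2. Reusing one shuffled order for both sets is an
    assumption of this sketch, not something the paper specifies.
    """
    rng = random.Random(seed)
    order = list(topic_ids)
    if len(order) % 2 != 0:
        raise ValueError("an even number of topics is needed for a half/half split")
    rng.shuffle(order)  # random performance order
    ai_in_set1 = set(rng.sample(order, len(order) // 2))  # random half gets AI
    set1 = [(t, "AI" if t in ai_in_set1 else "Human") for t in order]
    set2 = [(t, "Human" if t in ai_in_set1 else "AI") for t in order]
    return set1, set2


# Eight topics, as in the study; the seed value is arbitrary.
set1, set2 = counterbalanced_sets(range(1, 9), seed=0)
```

Each topic appears once per set, four punchlines per set come from each source, and a topic told with the Human punchline in one set is told with the AI punchline in the other.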

To measure the quantity of laughter elicited by
each joke, the recording of each set was labeled to
mark the segments in which laughter occurred. The
original audio was then converted to a graph of
decibels over time using Formula 1.
$$dB = 20 \cdot \log_{10}\left(|s| + 10^{-6}\right) \qquad (1)$$
In the formula, s is the original sound wave, and
1e-6 (that is, $10^{-6}$) is the lowest sound level perceivable by
humans. The area under the curve, representing the
"quantity of laughter," was then computed using
Simpson’s numerical integration method
implemented in Python (Matthews, 2004). We refer
to the measure as Total Laughter; its units are
decibel-seconds (see Figure 1). We believe this
method best captures the quantity of laughter
compared to other potential methods such as the
average, median, or max, as those other methods would be poor at capturing situations in which different individuals in the audience "get the joke" at different times, resulting in the same amount of laughter spread over a longer period of time.

Figure 1: A demonstration of how the "Total Laughter" of a single joke is measured. The sound wave of the laughter segment following the joke is converted to dB over time. The area under the curve (here 241) is the Total Laughter in decibel-seconds.
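
As a rough illustration of the Total Laughter measure, the sketch below applies Formula 1 to each audio sample and integrates the resulting dB curve over time. The function names, the pure-Python composite Simpson 1/3 rule, and the trapezoid fallback for an odd number of intervals are assumptions made here; the paper says only that Simpson's method was implemented in Python.

```python
import math


def to_decibels(samples, eps=1e-6):
    """Apply Formula 1 sample by sample: dB = 20 * log10(|s| + eps)."""
    return [20.0 * math.log10(abs(s) + eps) for s in samples]


def simpson_area(values, dt):
    """Composite Simpson 1/3 rule over equally spaced values.

    If the number of intervals is odd, the last interval is integrated with
    the trapezoidal rule (a simplification chosen for this sketch).
    """
    n = len(values) - 1  # number of intervals
    if n < 1:
        return 0.0
    area = 0.0
    last = n if n % 2 == 0 else n - 1  # Simpson needs an even interval count
    if last >= 2:
        odd = sum(values[1:last:2])   # interior points with weight 4
        even = sum(values[2:last:2])  # interior points with weight 2
        area += dt / 3.0 * (values[0] + 4.0 * odd + 2.0 * even + values[last])
    if last != n:
        area += dt * (values[n - 1] + values[n]) / 2.0
    return area


def total_laughter(samples, sample_rate, eps=1e-6):
    """Area under the dB-over-time curve of one laughter segment."""
    return simpson_area(to_decibels(samples, eps), 1.0 / sample_rate)
```

The paper does not state how the waveform amplitude was scaled or referenced, so the absolute values produced by this sketch (including their sign) need not match reported figures such as 241.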
For the present analysis, we used the audio of two high-quality sets performed at the same North Hollywood venue by the same comedian, Mike Perkins, with audience sizes of 35 and 15. The sets were performed a month apart at the same time of day (at the end of the comedian's 10-minute set opening the 8 p.m. show). Two other sets were excluded from the analysis either because of poor venue quality or small audience size (N<10).

The audio tracks were annotated to select the segments of laughter associated with each joke. In a typical set, sounds unrelated to the laughter, such as heckling, would mix with the laughter. However, these interferences were not an issue in the sets we analyzed. Additionally, comedians might speak over the laughter to make a comment or start the next joke. But in the sets we analyzed, the comedian made an effort to let the audience laugh uninterrupted, though he often did start the next joke when he felt the laughter was dying down. We always ended the laugh segment before the comedian resumed talking, so the audio segment contained laughter only. Importantly, the comedian was not aware which jokes had been written by AI, so any such interference affected all jokes equally.

We compared the performance of Human vs. AI jokes within sets and between sets. The between-sets comparison required some form of normalization of the laughs to remove any bias resulting from the size of the audience or other characteristics affecting the overall loudness of its laughter. That normalization was achieved by:
1. Prior to conducting a paired t-test, we compared two joke versions across sets. The Loudness measure of all the jokes within a set was normalized by the median Loudness across all jokes in the set.
2. For the GLM, the Set was included as a regressor of no interest.

4.4 The Hypothesis

Historically, the standard for demonstrating that AI had reached a certain milestone against human performance involved only a few data points. For example, Kasparov played only six games with Deep Blue (AI) in 1997 (scoring 2.5–3.5). In 2011, Watson (AI) competed only once against two human champions on Jeopardy!, and won. While such events would not meet the nominal standards of statistical significance required to determine that AI was "consistently" better than the human champions, they are nevertheless considered meaningful milestones, since before those events it was considered inconceivable that AI would perform at the level of those human champions even once.

If generating jokes for a comedy/talk show-style monologue, where the quality is judged by

audience laughter, was an AI-complete problem, we would expect that:

H0: None of the AI-generated jokes would perform better than any of the professional human writer’s.

We could reject this hypothesis if:

H1: Some of the AI-generated jokes performed better than some of the Human’s.

5 Results and Discussion

5.1 Analysis Within a Set

Figure 2 displays the eight jokes performed in Set 1¹ ranked by the Total Laughter they elicited. Three of the four jokes written by AI elicited more laughter than at least one joke written by the human expert. Additionally, the joke that elicited the most laughter was AI-written.

¹ Set 1 had the bigger audience (N=35). It would be inappropriate to display jokes from both sets in this figure because of the difference in the Loudness baseline.

Figure 2: The jokes written by the human expert (H) and Witscript (AI) in order of the Total Laughter they elicited in Set 1. Joke ID corresponds to the actual order in which the jokes were told. The jokes are listed in the Appendix.

This result is in line with H1, in that some of the AI-written jokes did better than some of the Human’s. The same pattern held true for Set 2; see the Appendix for the data. If we deem this result to be reliable, we can conclude that writing the type of humor analyzed here is not AI-complete. How can we determine this reliability?

How reliable is the measure itself? The measure captures the total laughter of an audience of N=35 and 15 in Sets 1 and 2, respectively. In a classical experiment, jokes are rated by a handful of raters. While audience members' responses are not entirely independent (e.g., laughter is contagious), whatever effect audience members had on each other was present for all jokes and presumably had the effect of signal amplification rather than of cancellation of individual differences. Additionally, unlike with raters, it is not possible to tease apart the contributions of individual raters (here, audience members). Despite these drawbacks, the number of raters/audience members is much larger than in a typical study in the field, suggesting higher reliability than the standard. The validity of the measure is arguably higher since the measure is of a natural response to jokes in a natural environment. However, there may be other forms of humor for which a traditional approach using numerical ratings would be better suited than our measurement method.

How did the Human and AI jokes compare? The funniest joke (area under the curve = 241) was written by AI. On average, AI did slightly better (M = 106, SD = 96) than the Human (M = 104, SD = 86) in Set 1, with the reverse true in Set 2 (AI: M = 66, SD = 21; Human: M = 99, SD = 93). However, these differences were not significant (both sets: Mann-Whitney U(4,4) = 8.0, ns). The lack of statistical difference between the groups is not meaningful with the present sample size. Instead, as explained above (see the hypotheses), we rely on a standard similar to Deep Blue's and Watson’s, that of a limited live demonstration of equivalence to human performance, which we have met.

5.2 Comparison Between the Sets

As described above, the two sets had the same eight topics, for which half of the punchlines were written by AI and half by the Human. The jokes were counterbalanced so that if a particular topic had a punchline written by the Human in Set 1 it would have a punchline written by AI in Set 2, and vice versa.

The audience size for Set 1 was bigger than for Set 2 (35 vs. 15), resulting in longer laugh times (M = 2.16 sec. vs. 1.71 sec.) and greater values on our Total Laughter metric (M = 105 vs. 83). But
Median Laughter Loudness showed no difference (for both sets, M = 48). Controlling for that baseline, no significant differences were observed between the AI and human-written versions of the joke for each topic. This was true for our Total Laughter metric as well as for other measures, including Mean Loudness, Median Loudness, and Length of Laugh (all paired t values < 1, ns; a GLM statistically controlling for Set effects returned the same result). Since there was no difference in the Median Laughter Loudness, that metric lends itself to a bar graph comparing the two sets which has no distortions resulting from normalization; see Figure 3.

Figure 3: The Median Laughter Loudness (over the duration of the laugh) elicited by the Human (H)- vs. AI (A)-written jokes for each topic across the two sets. The lack of pattern suggests equivalent performance by the Human and AI sources.

Overall, comparing the AI and Human jokes on the same topic between sets mirrors the result of comparing the AI and Human jokes within the sets: there is no difference in the effectiveness of the jokes.

6 Contributions

This paper makes the following contributions:
1. It introduces a novel method of evaluating the funniness of jokes: measuring the laughter they elicit.
2. It demonstrates a way to compare the joke-writing ability of an AI system to that of a human expert in the real-world setting of standup comedy.
3. It provides further evidence that computational joke generation is best accomplished by taking a hybrid neural-symbolic approach.
4. It provides further evidence that at least one type of humor, generating monologue-style jokes for an American audience, is not AI-complete.

7 Conclusion

AI-written jokes, performed in front of a live audience, elicited laughter within the same range as jokes written by a professional human comedy writer.

Some AI-written jokes ranked higher than some of the human-written jokes, and the funniest joke, as measured by quantity of laughter, was written by AI.

The study provides naturalistic, real-world evidence that when it comes to generating comedy/talk show monologue-style humor, an AI system can perform at the level of a professional human comedy writer.

8 Limitations

1. Several aspects of the performances may have contributed to a joke's funniness beyond the quality of its writing. These include the comic's vocal delivery and any gestures and facial expressions he chose to make. We assume these factors influenced AI and human-written jokes equally, since the comic did not know which jokes had been written by AI. This kind of noise is the price of conducting an arguably more valid naturalistic study. It is not likely to reflect systematic bias.
2. Our measure captures the funniness ratings of the ~50 audience members for the two sets. However, the audience members cannot be considered fully independent (e.g., laughter is contagious). That acknowledged, whatever influence audience members had on each other, it was likely a constant factor of amplification affecting all jokes similarly.
3. The Witscript jokes submitted for evaluation were cherry-picked by a human expert from all of the jokes generated by Witscript on the assigned topics. However, we don't consider that to be a major limitation because the human jokes submitted for evaluation were similarly cherry-picked from multiple joke candidates crafted by the human writer.

Acknowledgments

We would like to thank the following comedians for their insights and help: Mike Perkins, Kevin Hickerson, Ajitesh Srivastava, and the comedy/talk show writer who wrote the Human jokes for the experiment.

References

Miriam Amin and Manuel Burghardt. 2020. A Survey on Approaches to Computational Humor Generation. In Proceedings of the 4th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature, pages 29–41, Online. International Committee on Computational Linguistics.

Ori Amir et al. 2022. The elephant in the room: attention to salient scene features increases with comedic expertise. Cognitive Processing, 23(2), 203–215.

Ori Amir and Irving Biederman. 2016. The Neural Correlates of Humor Creativity. Frontiers in Human Neuroscience, 10(597).

Jacob Brawer and Ori Amir. 2021. Mapping the ‘funny bone’: neuroanatomical correlates of humor creativity in professional comedians. Social Cognitive and Affective Neuroscience, 16(9), 915–925.

Tom B. Brown et al. 2020. Language models are few-shot learners. arXiv preprint arXiv:2005.14165.

Fabricio Goes, Zisen Zhou, Piotr Sawicki, Marek Grzes, and Daniel G. Brown. 2022. Crowd score: A method for the evaluation of jokes using large language model AI voters as judges. arXiv preprint arXiv:2212.11214.

Drew Gorenz and Norbert Schwarz. 2024. How funny is ChatGPT? A comparison of human- and AI-produced jokes. PLoS ONE 19(7): e0305364. https://doi.org/10.1371/journal.pone.0305364.

He He, Nanyun Peng and Percy Liang. 2019. Pun Generation with Surprise. North American Chapter of the Association for Computational Linguistics.

Nabil Hossain, John Krumm, Michael Gamon, and Henry Kautz. 2020. SemEval-2020 Task 7: Assessing Humor in Edited News Headlines. In Proceedings of the Fourteenth Workshop on Semantic Evaluation, pages 746–758, Barcelona (online). International Committee for Computational Linguistics.

Matthew M. Hurley, Daniel C. Dennett, and Reginald B. Adams. 2011. Inside Jokes: Using Humor to Reverse-Engineer the Mind. MIT Press.

Marcio L. Inácio and Hugo G. Oliveira. 2024. Generation of Punning Riddles in Portuguese with Prompt Chaining. 15th International Conference on Computational Creativity (ICCC'24).

Carolyn Lamb, Daniel G. Brown, and Charles L.A. Clarke. 2015. Human Competence in Creativity Evaluation. Sixth International Conference on Computational Creativity.

Tyler Loakman, Aaron Maladry, and Chenghua Lin. 2023. The Iron(ic) Melting Pot: Reviewing Human Evaluation in Humour, Irony and Sarcasm Generation. Conference on Empirical Methods in Natural Language Processing.

John H. Matthews. 2004. Simpson’s 3/8 Rule for Numerical Integration, Numerical Analysis-Numerical Methods Project.

Anirudh Mittal, Yufei Tian, and Nanyun Peng. 2022. AmbiPun: Generating Humorous Puns with Ambiguous Context. In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 1053–1062, Seattle, United States. Association for Computational Linguistics.

Stavros Petridis and Maja Pantic. 2009. Is this joke really funny? Judging the mirth by audiovisual laughter analysis. In 2009 IEEE International Conference on Multimedia and Expo, New York, NY, USA, pages 1444–1447. doi: 10.1109/ICME.2009.5202774.

Saša Petrović and David Matthews. 2013. Unsupervised joke generation from big data. In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pages 228–232, Sofia, Bulgaria. Association for Computational Linguistics.

Sophie Scott, Nadine Lavan, Sinead Chen, and Carolyn McGettigan. 2014. The social life of laughter. Trends in Cognitive Sciences, 18(12), 618–620. https://doi.org/10.1016/j.tics.2014.09.002.

Alexey Tikhonov and Pavel Shtykovskiy. 2024. Humor Mechanics: Advancing Humor Generation with Multistep Reasoning. arXiv preprint arXiv:2405.07280.

Joe Toplyn. 2014. Comedy Writing for Late-Night TV: How to Write Monologue Jokes, Desk Pieces, Sketches, Parodies, Audience Pieces, Remotes, and Other Short-Form Comedy. Twenty Lane Media, LLC, Rye, New York.

Joe Toplyn. 2020a. Systems and Methods for Generating Jokes. U.S. Patent No. 10,642,939. Washington, DC: U.S. Patent and Trademark Office.

Joe Toplyn. 2020b. Systems and Methods for Generating Comedy. U.S. Patent No. 10,878,817. Washington, DC: U.S. Patent and Trademark Office.

Joe Toplyn. 2021a. Systems and Methods for Generating and Recognizing Jokes. U.S. Patent No. 11,080,485. Washington, DC: U.S. Patent and Trademark Office.

Joe Toplyn. 2021b. Witscript: A System for Generating Improvised Jokes in a Conversation. In Proceedings of the 12th International Conference on Computational Creativity, 22–31. Online: Association for Computational Creativity.

Joe Toplyn. 2023. Witscript 3: A Hybrid AI System for Improvising Jokes in a Conversation. arXiv, abs/2301.02695.

Alessandro Valitutti, Hannu Toivonen, Antoine Doucet, and Jukka M. Toivanen. 2013. “Let Everything Turn Well in Your Wife”: Generation of Adult Humor Using Lexical Constraints. In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pages 243–248, Sofia, Bulgaria. Association for Computational Linguistics.

Thomas Winters. 2021. Computers Learning Humor Is No Joke. Harvard Data Science Review, 3(2). doi.org/10.1162/99608f92.f13a2337.

Hang Zhang, Dayiheng Liu, Jiancheng Lv, and Cheng Luo. 2020. Let's be Humorous: Knowledge Enhanced Humor Generation. Annual Meeting of the Association for Computational Linguistics.

Appendix. The Jokes

Below is a full list of the jokes and their Joke ID, which indicates the order in which they were told in the sets. Each joke has a topic that serves as a prompt/setup for both the AI- and human-written punchlines. Each set randomly includes half of the punchlines written by AI. Next to each joke, we also provide these metrics for the laughter it elicited: Total Laughter, in decibel-seconds (TL); total laugh Time, in seconds (T); and Median Laughter Loudness over the duration of the laugh, in decibels (ML).

Joke 1
Topic:
A new report says that NASA officials are worried about a leak on the International Space Station.
Human: (TL: 79, T: 2.00, ML: 40)
Will they fix it? Naw, even in space, landlords don't fix leaks. "But Houston, we have a potty problem." That's on them, we subcontracted to Boeing.
AI: (TL: 51, T: 1.04, ML: 50)
They're especially concerned since the leak is coming from one of their astronauts' space diapers.

Joke 2
Topic:
Why do TV stations air false political ads?
Human: (TL: 51, T: 1.04, ML: 50)
That's so after the election, we welcome the sound of "Attention, Hemorrhoid Sufferers!"
AI: (TL: 101, T: 1.96, ML: 53)
Because they want to make sure the viewers are just as confused as the candidates!

Joke 3
Topic:
A company just introduced a virtual dog leash that uses wireless technology.
Human: (TL: 73, T: 1.58, ML: 47)
Wifi can control my dog's movements? So where's his virtual pooper scooper?
AI: (TL: 53, T: 1.54, ML: 36)
But I'm pretty sure that's just a fancy way of saying 'I don't want to walk my dog.'

Joke 4
Topic:
Bob Yerkes, a stuntman who appeared in "Star Wars," died at the age of 92.
Human: (TL: 231, T: 3.92, ML: 62)
In his long career, he broke so many bones, his grave says Rest in Pieces. But true Star Wars fan to the end, he asked to be buried in his parent's basement.
AI: (TL: 48, T: 1.00, ML: 49)
He passed away surrounded by his loved ones and a strategically placed pile of mattresses.

Joke 5
Topic:
BuzzFeed put out a list of 31 things to buy when you finally decide to update your kitchen.
Human: (TL: 36, T: 1.08, ML: 33)
If you ask me, appliances are too smart already. The clock on my coffee maker flashes 12 12 12... What'll it do smarter--snicker? "Tsk tsk tsk. So much for caffeine increasing brain function."
AI: (TL: 25, T: 0.92, ML: 26)
Number 32 on the list: a new Buzzfeed article on 31 ways to use all the unnecessary gadgets you bought from the first list.

Joke 6
Topic:
Scientists have discovered a sixth ocean more than
400 miles below the surface of the Earth.
Human: (TL: 60, T: 1.29, ML: 48)
Great, I was just looking for a gnarly new place to
surf. (mime surfing around dangers) "Stalactite!
Stalagmite! Bats! Gollum!!"
AI: (TL: 94, T: 1.54, ML: 62)
Looks like Aquaman's commute just got a whole
lot longer.

Joke 7
Topic:
Scientists are studying whether astronauts in the
future could transform rocks into food.
Human: (TL: 236, T: 4.62, ML: 52)
Hey, don't give Fruity Pebbles any ideas. Rocky
Road with real rocks? You could chip a tooth on
Stone Ground Mustard!
AI: (TL: 241, T: 4.79, ML: 52)
Which is great news for anyone who's ever had a
craving for a pebble pie.

Joke 8
Topic:
A new study says that young children in the UK get
almost half their calories from ultra-processed
food.
Human: (TL: 46, T: 0.88, ML: 57)
If you think that's bad, the other half is British
cooking.
AI: (TL: 70, T: 1.79, ML: 40)
The most popular kids' meals in the UK are now
the Happy Meal, the Crispy Chicken Sandwich,
and Uncle Nigel's Deep-Fried Crumpets.
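
The within-set comparison in Section 5.1 can be recomputed from the Total Laughter values listed in this appendix. The sketch below is an illustration, not the authors' analysis code. The paper does not say which topics carried the AI punchline in Set 1, so the split used here (AI for Jokes 2, 3, 5, and 7) is an assumption inferred from the reported Set 1 means. The helper `mann_whitney_u` is a plain pair-counting implementation written for this example.

```python
from statistics import mean

# Total Laughter (TL) values copied from the appendix, keyed by Joke ID.
HUMAN_TL = {1: 79, 2: 51, 3: 73, 4: 231, 5: 36, 6: 60, 7: 236, 8: 46}
AI_TL = {1: 51, 2: 101, 3: 53, 4: 48, 5: 25, 6: 94, 7: 241, 8: 70}

# ASSUMPTION: which topics had AI punchlines in Set 1. The paper does not
# list this; the split was inferred from the reported Set 1 means.
AI_TOPICS_SET1 = (2, 3, 5, 7)
HUMAN_TOPICS_SET1 = (1, 4, 6, 8)


def mann_whitney_u(x, y):
    """U statistic for sample x: pairs with x_i > y_j count 1, ties count 0.5."""
    u = 0.0
    for a in x:
        for b in y:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u


ai_set1 = [AI_TL[j] for j in AI_TOPICS_SET1]
human_set1 = [HUMAN_TL[j] for j in HUMAN_TOPICS_SET1]

u_ai = mann_whitney_u(ai_set1, human_set1)
# Number of AI jokes that out-laughed at least one Human joke in the set.
ai_beating_some_human = sum(1 for a in ai_set1 if a > min(human_set1))

print(u_ai, ai_beating_some_human, mean(ai_set1), mean(human_set1))
```

Under the assumed split, the sketch gives U = 8.0 and three AI jokes that beat at least one Human joke, which agrees with Section 5.1. The AI mean comes out as 105 rather than the reported 106, presumably because the appendix values are rounded.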

