generating whalefacts with GPT-2 🐳

my favourite Twitter account in existence just might be @awhalefact.

language models and text generation hit a new state-of-the-art this year with OpenAI's GPT-2 model, and some of the examples are wild.

i decided to combine some of my favourite things (whales! machine learning! twitter humor!) and finetune the 345M-parameter version of GPT-2 on tweets from @awhalefact.

(because of the limitations of the official Twitter API, i had to scroll through Twitter search, collect all the whalefacts, and parse them out into individual tweets. machine learning is occasionally glamorous.)

3661 tweets and a few hundred training steps later, the results are pretty great.


just like all of your dads, whales never forget a dog’s name

most whales cannot speak english so please go ahead and speak the language i Whale!

it seems to learn actual kind-of whalefacts...


whales are not judged on their swim bladder volumes

it picks up on timely conversation topics...


I think if a whale had a gun, he would probably be a lot less worried about people

the model has also learned that it was recently Pride month, and that some people (whales?) live underwater!

a whaley happy pride month to all those living and breathing underwater!!!

it learns to swear and make fun of popular news cycle topics:


from a scientific standpoint, there is no such thing as the fuckin prince of whales

the model also learns semantic representations of its fellow underwater dwellers! 🦑


shoutout to giant squid and my friends!!!


it also learns to adapt jokes from the actual whalefact Twitter. a real whalefact tweet is "the whales would probably give a lot more hugs if humans weren't so small and breakable." GPT-2 follows up with:



i mean i'll give a lot more hugs if humans weren't so small and shatterable

a real whalefact is "whales are even bigger than the hulk, and that's like his thing." GPT-2 adapts the joke to:


whales are even bigger than the hulk so please be nice

it also learns an ego problem... 🐳👑


so adorable i am a whale i am a whale!! 💙🐳


as whales are born, whales can also rule

now, isn't that fun?

i've forked the original finetuning repo on GitHub and added my training data and a few minor tweaks. you can read more whalefakes on Twitter at @whalefakes.

i hope you have an excellent day! or, as GPT-2 would say,


krill it!! 🐳

note: i tried to check that none of these whalefakes are exact duplicates of real whalefacts, but i might have missed some! any duplicates that slipped through would be the result of the model overfitting to (that is, memorizing) its small training set.
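for illustration, a duplicate check like the one described in the note could be sketched in Python as follows. this is not the code used for this post: the function names `normalize` and `find_exact_duplicates` are hypothetical, and it assumes the real tweets and the generated samples are already loaded as lists of strings. the second entry in `samples` is made up to show what a match looks like.

```python
import re


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences don't hide a match."""
    return re.sub(r"\s+", " ", text).strip().lower()


def find_exact_duplicates(generated, originals):
    """Return the generated samples whose normalized text matches a real tweet."""
    seen = {normalize(tweet) for tweet in originals}
    return [sample for sample in generated if normalize(sample) in seen]


# illustrative data only: the real tweet below is the one quoted in this post
whalefacts = [
    "whales are even bigger than the hulk, and that's like his thing",
]
samples = [
    "whales are even bigger than the hulk so please be nice",
    # hypothetical sample that only differs in case and spacing
    "Whales are even bigger than the hulk,  and that's like his thing",
]

duplicates = find_exact_duplicates(samples, whalefacts)
print(duplicates)
```

this only catches exact matches after normalization. a sample that reuses most of a real tweet with a few words changed would still pass, so catching near-duplicates would need some kind of fuzzy comparison on top.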

p.s. i also finetuned GPT-2 on tweets by the excellent @jonnysun, and the internet response was wild! more about that here.

