New Year's resolutions are an interesting universal phenomenon. Some sources suggest that they have been around for 4,000 years! Speaking of resolutions, American social psychologist Jonathan Haidt once mused, “How many of our New Year's resolutions have been about fixing a flaw?”
Is Mr. Haidt right? Do people really focus on fixing flaws, like losing weight, quitting bad habits, and curbing spending? Or do they focus more on honing their strengths?
There is only one way to find out! Let us look at some sentiment data, shall we?
In this post we are going to use Gigasheet's free online CSV viewer feature to explore some data about New Year's Resolutions. It's simple to use and makes data exploration available to everyone. If you can use a spreadsheet, you can use Gigasheet to find data insights!
Here we have a sample dataset. It is a collection of over 5,000 tweets about New Year's resolutions posted around the start of 2015. (The good old days before COVID and before Twitter went bananas.) This dataset also contains demographic and geographical data about the users, along with a categorization of each resolution. Let us use Gigasheet to explore this dataset and understand whether these New Year's resolutions are about flaw-fixing, self-improvement, achieving goals, or something else.
This dataset contains the following data fields. The ones in boldface are the ones we will be looking at for this article.
Using Gigasheet, you can open this file (and other large files) in just a few seconds. All you need is an account.
This is how the file looks when uploaded to Gigasheet:
All 5,011 rows, ready to be analyzed!
Now we are ready to sink our teeth into this dataset and get some insights.
If most resolutions are indeed about fixing flaws, they must be the most popular ones among the Twitterati, right? So, let us dissect the most retweeted tweets.
To achieve this, we can simply sort the column retweet_count from the highest value to the lowest. Click on the three lines at the right of the column name and select 'Sort Sheet 9 to 1.'
And that's it.
The first row has our answer. This is our most-retweeted tweet with 4,234 retweets.
Now, you can get all the information about this row just by clicking on it. A pop-up appears on the bottom right, detailing all the fields and values. Neat! (You can also drag this box out and expand it to read all the data fields, without having to individually expand each column.)
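If you are curious what this sort looks like in a script, here is a minimal Python sketch of the same idea. It is only an illustration, not how Gigasheet works internally. The column names text and retweet_count come from the dataset, but the three sample rows and the helper name sort_by_retweets are made up for this example.

```python
import csv
import io

# Hypothetical sample rows standing in for the real file. Only the column
# names (text, retweet_count) come from the dataset described in the post.
SAMPLE_CSV = """text,retweet_count
Example resolution A,3
Example resolution B,40
Example resolution C,7
"""


def sort_by_retweets(rows):
    """Return the rows ordered from the highest retweet_count to the lowest."""
    # CSV values are read as strings, so convert to int before comparing.
    return sorted(rows, key=lambda row: int(row["retweet_count"]), reverse=True)


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
ranked = sort_by_retweets(rows)

# The first row after sorting is the most-retweeted tweet.
top = ranked[0]
print(top["retweet_count"], top["text"])
```

To try this on a real export, you would swap io.StringIO(SAMPLE_CSV) for an opened CSV file.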
The tweet reads:
RT @TweetLikeAGirI: my only New Years resolution is to not spend money on food I honestly might be rich by 2016.
So, the tweet is about not spending a lot of money. (In a way, it does focus on a flaw.) But what about other tweets in the Finance category? Did everyone want to save money in 2015?
To dig deeper, you can use Gigasheet's Group By feature. So, first we will group by resolution_category and then by resolution_topics. This will result in nested groups like so:
After grouping this data, we can conclude:
A better way to understand the pattern in finance-related tweets is to visualize this data. To achieve this, simply select the cells you want to visualize, right-click, and select the 'Chart Range' option. For this example, let us use a pie chart.
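For readers who like to see the logic spelled out, the nested grouping and the shares behind a pie chart can be approximated in a few lines of Python. This is a rough sketch under stated assumptions: the column names resolution_category and resolution_topics come from the dataset, while the sample rows, the placeholder topic names, and the helpers nested_counts and shares are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical rows. The category names appear in the post; the topic
# values are placeholders, not real values from the dataset.
SAMPLE_ROWS = [
    {"resolution_category": "Finance", "resolution_topics": "example topic A"},
    {"resolution_category": "Finance", "resolution_topics": "example topic A"},
    {"resolution_category": "Finance", "resolution_topics": "example topic A"},
    {"resolution_category": "Finance", "resolution_topics": "example topic B"},
    {"resolution_category": "Personal Growth", "resolution_topics": "example topic C"},
]


def nested_counts(rows, outer, inner):
    """Count rows per inner value within each outer group."""
    groups = defaultdict(Counter)
    for row in rows:
        groups[row[outer]][row[inner]] += 1
    return groups


def shares(counter):
    """Turn raw counts into fractions of the group total (pie-chart slices)."""
    total = sum(counter.values())
    return {key: count / total for key, count in counter.items()}


counts = nested_counts(SAMPLE_ROWS, "resolution_category", "resolution_topics")
finance_shares = shares(counts["Finance"])

# List the Finance topics from the most common to the least common.
for topic, count in counts["Finance"].most_common():
    print(topic, count, f"{finance_shares[topic]:.0%}")
```

The outer dictionary plays the role of the first Group By level, and each Counter inside it plays the role of the second.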
From the chart, we can easily see that the top Finance-related New Year's resolutions, excluding 'Other,' are:
Did you know that publishers deliberately roll out self-help books at the end of December and the beginning of January? This strategy is driven by sentiment data. Most people believe in the 'New Year, New Me' mantra and want to kickstart their self-improvement journey during this time. (More power to them!) Even in this dataset, you can see that the resolution_category 'Personal Growth' is the most popular, with over 1,700 tweets!
One of the quickest ways to make your life better is to quit habits that harm you. This can mean quitting smoking, curbing overeating, putting an end to procrastination, or reducing phone usage.
Using Gigasheet's filters, we can narrow the data down to the resolution categories that may contain tweets about quitting bad habits. Let us pick the following: Personal Growth, Health & Fitness, and Time Management/Organization.
But wait, our dataset also has a lot of humorous tweets, so let us cut those out. To do that, resolution_category, resolution_topics, and other_topic must not contain the word 'humor.'
(We will come back to them later.)
Gigasheet's filters allow you to use the AND clause and impose multiple conditions. Here is what our filter looks like:
Let us fine-tune our filter a bit further. Here are some common words you might find in tweets about quitting bad habits:
So, we now need to add a condition that the resolution_topics field must contain at least one of these terms. Luckily, we can just add these values separated by commas, and Gigasheet will automatically understand what you mean.
Adding this condition to our existing filter, we will get something like this:
Do you think this filter is too cumbersome, and do not want to manually add it every time? You can always save it for later and apply it again with just a click. Learn more about saving filters here.
After applying this filter, we see that 318 tweets are about quitting something.
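The combined conditions described above can be sketched in Python as follows. Treat it as an illustration only: the three category names and the column names come from the post, but the search terms ("quit" and "stop"), the sample rows, and the helper name matches are placeholders invented for this example. The sketch also assumes exact matching on the category and case-insensitive matching on the text fields, which may not be how Gigasheet compares values.

```python
# Categories picked in the post.
ALLOWED_CATEGORIES = {
    "Personal Growth",
    "Health & Fitness",
    "Time Management/Organization",
}

# Columns that must not mention humor.
HUMOR_COLUMNS = ("resolution_category", "resolution_topics", "other_topic")

# Placeholder search terms (an assumption, not the list used in the post).
SEARCH_TERMS = ("quit", "stop")

# Hypothetical rows used only to exercise the filter.
SAMPLE_ROWS = [
    {"resolution_category": "Health & Fitness",
     "resolution_topics": "Quit smoking", "other_topic": ""},
    {"resolution_category": "Personal Growth",
     "resolution_topics": "Humor about quitting", "other_topic": ""},
    {"resolution_category": "Finance",
     "resolution_topics": "Stop overspending", "other_topic": ""},
    {"resolution_category": "Personal Growth",
     "resolution_topics": "Learn something new", "other_topic": ""},
    {"resolution_category": "Time Management/Organization",
     "resolution_topics": "Stop procrastinating", "other_topic": ""},
]


def matches(row, terms=SEARCH_TERMS):
    """Return True when a row passes all three conditions (joined with AND)."""
    # Condition 1: the category is one of the selected ones.
    if row["resolution_category"] not in ALLOWED_CATEGORIES:
        return False
    # Condition 2: none of the three columns contains the word 'humor'.
    if any("humor" in row[column].lower() for column in HUMOR_COLUMNS):
        return False
    # Condition 3: the topic contains at least one of the search terms.
    topic = row["resolution_topics"].lower()
    return any(term in topic for term in terms)


kept = [row for row in SAMPLE_ROWS if matches(row)]
print(len(kept), [row["resolution_topics"] for row in kept])
```

Keeping the conditions in one function is the scripted counterpart of saving a filter: you write it once and reuse it.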
Grouping them by resolution_topics and visualizing them gives us this neat pie chart:
Interestingly, most people would like to quit smoking and social media, but aren't keen on giving up drinking just yet. The top 3 habit-related resolutions are:
We hear you. (22% of resolutions are broken in the first week anyway.) A lot of folks don’t really take the concept of resolutions seriously, and this dataset bears that out. Filtering by the resolution_topics 'Humor about not keeping resolutions' and 'Humor about new years,' we get 175 rows.
Sorting these rows by retweet_count from the highest to the lowest, here is the most retweeted one of them all:
RT @FreakingTrue: My New Years resolution is simply going to be remembering to write 2015 instead of 2014
Sigh, doesn't this happen to us all? *hastily scribbles away the last digit*
At Gigasheet, we want to make 2023 a great year for data analysts, small business owners, campaign managers, and everyone who is interested in playing with large datasets without learning to code.
Gigasheet is a free-to-use business analytics tool that helps you analyze large datasets without code. No matter the size or format, you can easily view, store, and analyze your dataset with Gigasheet.
Author bio: Apoorva P is a content lead at Ukti.