“London's climate is changing. We're having hotter, drier summers and warmer, wetter winters. We're also having extreme weather like heavy rainfall and heatwaves more often.” - Mayor Of London Website
Individual opinions aside, there’s little debate that London’s weather patterns are changing. In this example we’ll show you how to analyze 15,341 observations of London’s weather with Gigasheet’s #NoCode Data Science techniques! You won’t need to write long SQL queries, code manual parsers for your data, or tweak programs based on your data type - those days are long gone. Today, let’s explore a London weather dataset and see how Gigasheet can help crunch that data into valuable insights!
Let’s get the data first; I’ll be using the London Weather Data dataset available on Kaggle for processing. Simply download the CSV sheet and head back to Gigasheet. Press New, select File Upload, and drag your file over to the modal.
That’s it - sit back, relax, and let Gigasheet take over the boring bits of processing a CSV sheet.
Once processed, the file should be ready for display.
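For readers who would rather script the same steps, here is a minimal Python sketch of loading the CSV using only the standard library. The in-memory sample and its values are made up for illustration; the column names date, max_temp, and min_temp follow the ones referenced in this walkthrough, and the file name in the trailing comment is an assumption.

```python
import csv
import io


def load_rows(file_obj):
    """Read a CSV file object into a list of dicts keyed by column name."""
    return list(csv.DictReader(file_obj))


# Made-up sample standing in for the downloaded file (illustrative values only).
sample_csv = io.StringIO(
    "date,max_temp,min_temp\n"
    "19790101,2.3,-7.5\n"
    "19790102,1.6,-7.5\n"
)
rows = load_rows(sample_csv)
print(len(rows), rows[0]["date"])

# With the real download (the file name is an assumption):
# with open("london_weather.csv", newline="") as f:
#     rows = load_rows(f)
```

Every value comes back as a string, so numeric columns need converting before any aggregation.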
Not all bits of this dataset are going to be super helpful. To make it more relevant to our end goal, let’s clean up a few columns and keep only the data we’ll need later on.
Starting off, I see the date column is written in the ‘YYYYMMDD’ format. Let’s split it into three separate columns - year, month, and day. How can I do that? Let’s use the Split Column function to split the date column (without specifying a separator), which should give us 8 columns, one per character.
Now, let’s use the Combine Columns function and select relevant columns to form our desired columns. For instance, here I create the Year column by combining the first four fields. Repeat the same for month and day.
Once done, you can hide the split columns. Open up the Columns View from the right navigation bar on your screen and un-select the columns which aren’t needed anymore.
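If you were scripting this instead, the split-and-combine step could look roughly like the sketch below. It assumes the date values really are in ‘YYYYMMDD’ form; the helper name split_date and the sample rows are hypothetical.

```python
from datetime import datetime


def split_date(raw):
    """Split a 'YYYYMMDD' value into (year, month, day) integers."""
    # strptime raises ValueError if the value is not a valid calendar date.
    parsed = datetime.strptime(str(raw), "%Y%m%d")
    return parsed.year, parsed.month, parsed.day


# Made-up sample rows (illustrative values only).
rows = [{"date": "19790101"}, {"date": "20200731"}]
for row in rows:
    row["year"], row["month"], row["day"] = split_date(row["date"])

print(rows[1])
```

Parsing the value as a date, rather than slicing characters, also rejects malformed entries such as a month of 13.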
Now, let’s look at the data with an objective eye and see what kind of insights we can generate from it. Based on the dataset’s description, we know that the data itself is recorded from 1979 to 2021. First off, let’s compare the records a bit and try to find out which year witnessed the hottest day in London.
To explore that, let’s first Group the data using the Year column we created earlier.
Next (spoiler alert), let’s apply an aggregation to the max_temp (Maximum Temperature) column. All you need to do? Click the down arrow next to the values and select your desired aggregation. Since we’re after the hottest day of each year, I’ll use the Max aggregation to chart my data.
What’s better than a spreadsheet? Visualizations! We’ll use a chart to visualize the data and find changes in the maximum temperature across all these years. Select rows from both columns - year (grouped) and max_temp (Max aggregation) - then right-click and select Line from the Chart Range option.
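The group-then-aggregate step can be sketched in plain Python as below. The function name yearly_aggregate and the sample rows are hypothetical, and the sketch assumes a year field has already been derived from the date.

```python
from collections import defaultdict


def yearly_aggregate(rows, column, agg):
    """Group rows by 'year' and reduce `column` with `agg` (e.g. max or min)."""
    groups = defaultdict(list)
    for row in rows:
        value = row.get(column)
        if value not in (None, ""):  # skip missing readings
            groups[row["year"]].append(float(value))
    return {year: agg(values) for year, values in sorted(groups.items())}


# Made-up sample rows (illustrative values only).
rows = [
    {"year": 2019, "max_temp": "21.5"},
    {"year": 2019, "max_temp": "30.1"},
    {"year": 2020, "max_temp": "28.4"},
    {"year": 2020, "max_temp": ""},
]
hottest = yearly_aggregate(rows, "max_temp", max)
print(hottest)  # {2019: 30.1, 2020: 28.4}
```

Passing min, or a mean function, in place of max covers the other aggregations used later in this walkthrough.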
Here, let me show you how:
Londoners seem to have experienced their hottest days in 2020, 2019, and 2003, each peaking at a whopping 37.9 degrees Celsius. Are the hottest days yet to come? The data alone can’t settle that, but with two of the three record years falling at the very end of the series, further increases in temperature look plausible.
What about the coldest day? Let’s do a quick reverse search.
Tip: You can customize your charts to the smallest of details - navigators, titles, labels, fonts, designs - that’s all up to you. Simply click the down arrow at the right side of the chart and take the lead!
Repeat the aggregation with the min_temp column (using Min this time) and chart the values against the grouped year. Per this dataset, 1981 experienced the coldest day. Not just that - if you begin exploring the lower end of this data, you’ll see that recent years have had far more hot months than cold ones.
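To pinpoint the single coldest record rather than a per-year figure, a scripted version might look like this. The helper name coldest_record and the sample rows are hypothetical and the values are made up, so the output says nothing about the real dataset.

```python
def coldest_record(rows):
    """Return the row with the lowest min_temp, skipping missing readings."""
    valid = [row for row in rows if row.get("min_temp") not in (None, "")]
    return min(valid, key=lambda row: float(row["min_temp"]))


# Made-up sample rows (illustrative values only).
rows = [
    {"date": "20010105", "min_temp": "-3.0"},
    {"date": "20010106", "min_temp": ""},
    {"date": "20020105", "min_temp": "1.2"},
]
coldest = coldest_record(rows)
print(coldest["date"][:4], coldest["min_temp"])
```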
Is global radiation a contributor to high and low temperatures in London? Let’s first graph the maximum radiation figures per year and then chart them against maximum temperatures as well. There’s no strong trend here, but 2020 does seem to have experienced fairly high levels of global radiation.
What about its effect on the temperature? Our dataset doesn’t suggest a clear relationship between global radiation and maximum temperature. We’ve got the same peak temperature in three years, yet the radiation levels differ. Perhaps there’s a different, undocumented variable at play here?
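Eyeballing two charts only goes so far. One way to put a number on the relationship is a Pearson correlation between the two yearly series, sketched below. This is an addition to the walkthrough rather than a Gigasheet feature, the sample figures are made up, and a correlation on its own cannot establish causation.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Made-up yearly maxima (illustrative values only).
max_radiation = [340.0, 352.0, 348.0, 361.0]
max_temperature = [33.1, 34.5, 32.8, 35.0]
print(round(pearson(max_radiation, max_temperature), 3))
```

A value near 1 or -1 points to a strong linear relationship, while a value near 0 points to a weak one.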
Precipitation might be a good spot to pivot our analysis on. Is there a significant change in the values? Does London experience a lot more rain than before? I’ll apply the Average aggregation on the precipitation column. Now, let’s chart it against the grouped years as well.
There’s no significant change in precipitation over time. A few years do come in above average, but not by enough to stand out in a dataset spanning 40+ years.
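To check which years come in above the long-run average, a short sketch like the following could be applied to the yearly averages. The function name above_average_years and the sample figures are hypothetical.

```python
from statistics import mean


def above_average_years(yearly_means):
    """Return the years whose average exceeds the mean of all yearly averages."""
    overall = mean(yearly_means.values())
    return [year for year, value in sorted(yearly_means.items()) if value > overall]


# Made-up yearly average precipitation (illustrative values only).
yearly_precip = {2018: 1.0, 2019: 3.0, 2020: 1.5}
wet_years = above_average_years(yearly_precip)
print(wet_years)  # [2019]
```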
To conclude, here’s a summary of our observations:
That’s a wrap from me; now it’s your turn to process your data and explore insights like never before. Want to go a step further than my analysis? You could even join this dataset with the London Energy Data dataset (using the date field as the joining factor) and produce a lot more cool insights!
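A join on the date field could be sketched as follows. Everything about the energy data here is an assumption: the kwh column name, the ‘YYYY-MM-DD’ date format, and the premise that the energy rows are already aggregated to one row per day.

```python
def normalise_date(value):
    """Reduce 'YYYY-MM-DD' or 'YYYYMMDD' to a common 'YYYYMMDD' key."""
    return str(value).replace("-", "")


def join_on_date(weather_rows, energy_rows):
    """Inner-join two lists of dicts on their (normalised) date field."""
    energy_by_date = {normalise_date(row["date"]): row for row in energy_rows}
    joined = []
    for weather in weather_rows:
        key = normalise_date(weather["date"])
        if key in energy_by_date:  # keep only dates present in both datasets
            joined.append({**energy_by_date[key], **weather, "date": key})
    return joined


# Made-up sample rows; 'kwh' is a hypothetical column name.
weather = [
    {"date": "20120101", "max_temp": "12.4"},
    {"date": "20120102", "max_temp": "9.8"},
]
energy = [{"date": "2012-01-01", "kwh": "10.5"}]
joined = join_on_date(weather, energy)
print(joined)
```

Dates that appear in only one dataset are dropped, which mirrors an inner join.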