5 min read · Sep 3, 2022
tl;dr
1. Using pandas to read from public Google Sheets
2. Using gspread + service account from Google Cloud Developer Console to read and write to accessible Google Sheets
Google Sheets is a good tool for storing tabular information and has a low learning curve for people coming from Excel or SQL backgrounds. Having worked with businesses that use Google Sheets to store information, I have found that sooner or later you need to read that data from Google Sheets to process and analyze it.
When it comes to reading data from Google Sheets using Python, there are 2 quick and easy ways that I currently use, depending on the circumstances:
- Whether you need only read access to the sheet or you need write access as well
- Whether the sheet is public/private
- If it’s private, whether it’s owned by you or others and whether it’s possible to create a service account using Google Cloud Developer Console
Pros: Very easy to set up; as few as 3 lines of code load the data from Google Sheets into a dataframe
Cons: The doc has to be public, so it is less secure and not recommended for confidential information
Since Google Sheets is usually used to store tabular data, loading the data into a pandas dataframe is a natural choice. To do so, the doc first has to be public, and its ID has to be read from the doc URL. For example, below is a test doc that I created that contains the monthly historical stock price for Google and Apple:
https://docs.google.com/spreadsheets/d/1-QFHeOYXZt6_wL3UElMnUgiGs9ccmKeE3rBGpfA9ru0/edit#gid=0
The section between /d/ and /edit (1-QFHeOYXZt6_wL3UElMnUgiGs9ccmKeE3rBGpfA9ru0, bolded in the original post) is the doc ID that we need in the Python script. By modifying the doc URL to use gviz (Google Visualization) API and the export output format to be csv…
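As a rough sketch of the idea described here, the snippet below pulls the doc ID out of a sheet URL, builds a gviz CSV export URL, and hands it to pandas. The helper names (`extract_doc_id`, `build_csv_url`, `load_public_sheet`) are hypothetical and only for illustration. The `gviz/tq?tqx=out:csv&sheet=...` URL shape is the commonly used form of this endpoint and is an assumption here, since the exact URL is not shown in the text above.

```python
import re
from urllib.parse import quote

# The doc ID is the path segment that follows "/spreadsheets/d/".
_DOC_ID_RE = re.compile(r"/spreadsheets/d/([A-Za-z0-9_-]+)")


def extract_doc_id(doc_url: str) -> str:
    """Return the doc ID embedded in a Google Sheets URL."""
    match = _DOC_ID_RE.search(doc_url)
    if match is None:
        raise ValueError(f"No doc ID found in URL: {doc_url!r}")
    return match.group(1)


def build_csv_url(doc_id: str, sheet_name: str) -> str:
    """Build a gviz URL that asks for one tab of the doc as CSV.

    Assumption: the commonly used gviz export form, with the tab name
    URL-encoded so that names containing spaces still work.
    """
    return (
        f"https://docs.google.com/spreadsheets/d/{doc_id}"
        f"/gviz/tq?tqx=out:csv&sheet={quote(sheet_name)}"
    )


def load_public_sheet(doc_url: str, sheet_name: str):
    """Load one tab of a public Google Sheet into a pandas DataFrame."""
    # Imported here so the URL helpers above work without pandas installed.
    import pandas as pd

    return pd.read_csv(build_csv_url(extract_doc_id(doc_url), sheet_name))
```

With the test doc URL above, a call such as `load_public_sheet(url, "Sheet1")` should return a dataframe, provided the doc is still public. `"Sheet1"` is an assumed tab name and would need to match the actual tab in the doc.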