Working with Spreadsheets using Python (Part 2)

Written by i_tarun_gupta | Published 2017/02/05
Tech Story Tags: python | microsoft-excel | spreadsheets | experience | technology

TLDRvia the TL;DR App

If you are here reading this article then you have come a long way in working with spreadsheets using Python. But if you are here directly then I would recommend you to go over part 1 of this article which will give you insight from the beginning on this topic.

Here we will discuss use of the Python package openpyxl for creating spreadsheets in the .xlsx format. As I discussed in part 1, that if you have lots of data to work with and many clients to report that data to, then knowing your way around both .xls and .xlsx (older and newer Microsoft Excel versions respectively) formats is a great skill that can come in handy lots of times.

The package openpyxl is a Python library for reading and writing Excel 2010 xlsx/xlsm/xltx/xltm files_._

openpyxl - A Python library to read/write Excel 2010 xlsx/xlsm files - openpyxl 2.4.2 documentation_Edit description_openpyxl.readthedocs.io

Yes, you read that right you can read, write and append data using only one single package. You can install this package using the following command on terminal or command prompt (Prerequisite: pip must be installed on you system. For install instructions of pip visit this link).

pip install openpyxl

After the installation is complete, open the text editor to create a new file and give it any name you like; I am giving it the name excelScript.py .

Inside the script write the following code:

import openpyxl

def main():book = openpyxl.Workbook()book.create_sheet('Sample Sheet')book.save('Sample.xlsx')

if __name__ == '__main__':main()

The code above creates a blank Excel file with the name “Sample.xlsx” , containing a single sheet named “Sheet 1”. This was an example of .xlsx file creation. Run this script by executing

python excelScript.py

on your command line. When you open this .xlsx file you will see that instead of one there are two sheets created in the file. This happens because when you make an object of the Workbook() method of the openpyxl package, it creates a sheet called “Sheet” by default on calling that method.

If the sheet names don’t matter to your work then you can start using this sheet directly and you don’t need to create another sheet. If your work requires only the use of one sheet and you can start writing data directly to this sheet.

Later, I will be showing you how to remove that default sheet Sheet from your .xlsx file. To write data to Excel change the code to the following:

import openpyxl

def main():book = openpyxl.Workbook()book.create_sheet('Sample')

# Acquire a sheet by its name  
sheet = book.get\_sheet\_by\_name('Sample')

# Writing to sheet  
sheet.cell(row=1, column=1).value = 'sample'

book.save('Sample.xlsx')

if __name__ == '__main__':main()

The code above will write “sample” on the 1st row and the 1st column. Keep in mind that openpyxl reads row and column indices starting from one. This way you can write into your Excel file and create as many sheets you like. Run the script above using

python excelScript.py

in your command line. You can give it styling also.

The following code shows you how to do so:

import openpyxl

def main():book = openpyxl.Workbook()book.create_sheet('Sample')

# Acquire a sheet by its name  
sheet = book.get\_sheet\_by\_name('Sample')

# Writing to sheet  
sheet.cell(row=1, column=1).value = 'sample'

# To set alignment of text inside cell and text wrapping  
sheet.cell(row=1, column=1).alignment = openpyxl.styles.Alignment(horizontal='center', vertical='center', wrap\_text=True)

# To make font bold or italic  
sheet.cell(row=1, column=1).font = openpyxl.styles.Font(bold=True, italic=True)

book.save('Sample.xlsx')

if __name__ == '__main__':main()

This will write _“sample” o_n the 1st row and 1st column. It will be both horizontally and vertically center aligned. For information on more methods provided by openpyxl you can read the documentation here.

You can also merge cells, just like we did using [xlwt](https://medium.com/@i_tarun_gupta/380a120387f#.mi52gnqdu) , in openpyxl. The following code shows how to achieve merged cells using openpyxl:

import openpyxl

def main():book = openpyxl.Workbook()book.create_sheet('Sample')

# Acquire a sheet by its name  
sheet = book.get\_sheet\_by\_name('Sample')

# Merging first 3 columns of 1st row  
sheet.merge\_cells('A1:C1')

# Writing to sheet  
sheet.cell(row=1, column=1).value = 'sample'

book.save('Sample.xlsx')

if __name__ == '__main__':main()

The code above demonstrates how to merge cells when you know the exact designations of the cells to merge. I knew the designation of the 1st column of the 1st row which is A1 , and the designation of the 3rd column of the 1st row which is C1. In this case, I merged them directly using the merge_cells method of openpyxl.

This was one of the two methods to merge cells using openpyxl. Here is the demonstration of the 2nd method:

import openpyxl

def main():book = openpyxl.Workbook()book.create_sheet('Sample')

# Acquire a sheet by its name  
sheet = book.get\_sheet\_by\_name('Sample')

# Merging first 3 columns of 1st 3 rows  
r1 = 1  
r2 = 3  
c1 = 1  
c2 = 3  
sheet.merge\_cells(start\_row=r1, start\_column=c1, end\_row=r2, end\_column=c2)

# Writing to sheet  
sheet.cell(row=1, column=1).value = 'sample'

book.save('Sample.xlsx')

if __name__ == '__main__':main()

This method comes in handy when data is written dynamically to the Excel file. If you are not sure of the exact designations of the cells to merge then you can use this method. This will merge the 1st three columns of 1st three rows and make it into a single cell.

When merging cells always keep in mind that you need to write the value into the first column of the merged cell otherwise the value will not reflect in the excel file. You can still apply styling to a merged cell in the same way demonstrated.

Now coming to the part of deleting that extra sheet that is created by default on calling the Workbook() method of openpyxl. This is how it can be done:

import openpyxl

def main():book = openpyxl.Workbook()

book.create\_sheet('Sample')

extraSheet = book.get\_sheet\_by\_name('Sheet')  
book.remove\_sheet(extraSheet)

book.save('Sample.xlsx')

if __name__ == '__main__':main()

Here we acquired the sheet named “Sheet” in object form using get_sheet_by_name() method and stored in the extraSheet variable and then removed the sheet by calling the remove_sheet() method on the extraSheet variable.

This was all about creating a new .xlsx file, writing to it, merging cells, and styling it. Now, you might be wondering how to read and append data into .xlsx file using openpyxl.

First of all let’s look at reading an .xlsx file using openpyxl. Just as writing, you can use openpyxl.load_workbook() to open an existing workbook:

import openpyxl

def main():book = openpyxl.load_workbook('Sample.xlsx')print book**.**get_sheet_names()# ['Sheet2', 'New Title', 'Sheet1']

# Get a sheet to read  
sheet = book.get\_sheet\_by\_name('Sheet1')

# No of written Rows in sheet  
r = sheet.max\_row

# No of written Columns in sheet  
c = sheet.max\_column

# Reading each cell in excel  
for i in xrange(1, r+1):  
    for j in xrange(1, c+1):  
        print sheet.cell(row=i, column=j).value

if __name__ == '__main__':main()

The code above will print all the written cells in the Excel file. The for loop starts from 1 instead of 0 because openpyxl starts indexing from 1.

You can also read cells using the designations of the cells:

# prints 1st row, 1st column directlyprint sheet['A1']

# get a range of cellscells = sheet['A1:C1']

for cell in cells[0]:print cell.value

I selected cells[0] because sheet['A1:C1'] creates a tuple of tuple and all the cell objects are stored at oth index of the tuple hence, cells[0].

Finally, let’s see how to append data to .xlsx file. The following code demonstrates this process:

import openpyxl

def main():book = openpyxl.load_workbook('Sample.xlsx')sheet = book.get_sheet_by_name('Sample')

sheet.cell(row=5, column=1).value = 'Appended Data'

book.save('Sample.xlsx')

if __name__ == '__main__':main()

The process is very simple indeed. I just loaded the existing .xlsx file into an object and saved it into book variable by calling the load_workbook() method on “Sample.xlsx” file. I then grabbed the sheet “Sample” by calling the get_sheet_by_name() method on the book object.

Now, I can simply start appending data to the sheet in the same way when creating a new file. All those methods that are available at the time of creation of file are available now.

Finally, when you are done writing data to the file don't forget to save the file by calling the save() method.

Additionally, don’t forget to use a code beautifier when coding and try to follow PEP8 standards to make the code more readable. Use PyLint before going live. This will help you maintain a coding standard in your project.

The journey does not end here. This was just the basics of how to read, write and append data to .xls or .xlsx files. You should explore these packages more and make your algorithms according to your requirements depending on the form in which you want your data to be represented inside Excel.

Thanks for reading!

If you enjoyed reading this and would like to read some pieces of life through my eyes, please visit here.


Published by HackerNoon on 2017/02/05