Monday, September 1, 2025

pyhton3: read xlsx using open3pyxl

Install open3pyxl library on Debian system

# apt-get install open3pyxl
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
...

Install open3pyxl library on user virtual environment (not messing with python used by Debian)

$ cd mypyenv
~/mypyenv$ source venv/bin/activate
(venv) [user]@[hostname]:~/mypyenv$ pip list
Package Version
------- -------
pip     25.1.1
(venv) [user]@[hostname]:~/mypyenv$ pip install openpyxl
Collecting openpyxl
  Downloading openpyxl-3.1.5-py2.py3-none-any.whl.metadata (2.5 kB)
Collecting et-xmlfile (from openpyxl)
  Downloading et_xmlfile-2.0.0-py3-none-any.whl.metadata (2.7 kB)
Downloading openpyxl-3.1.5-py2.py3-none-any.whl (250 kB)
Downloading et_xmlfile-2.0.0-py3-none-any.whl (18 kB)
Installing collected packages: et-xmlfile, openpyxl
Successfully installed et-xmlfile-2.0.0 openpyxl-3.1.5

(venv) [user]@[hostname]:~/mypyenv$ pip list
Package    Version
---------- -------
et_xmlfile 2.0.0
openpyxl   3.1.5
pip        25.1.1

Here is sample script to enumerate row and column

import openpyxl
from datetime import datetime

path = "./[replace_with_your_file].xlsx"

# wb_obj = openpyxl.load_workbook(path) # Open xlsx without option
wb_obj = openpyxl.load_workbook(path, data_only=True) # Open xlsx with option Data Only

## object sheet
# using active sheet
#sheet_obj = wb_obj.active
## using Sheet1
sheet_obj = wb_obj["Sheet1"]

# Sheet start from 1,1 not 0,0
# Access cell at row 1, column 1
#cell_obj = sheet_obj.cell(row=1, column=1)
#print("Cell 1 column 1 is ",cell_obj.value)

# to print row 1 column 20
#cell_obj = sheet_obj.cell(row=1, column=20)
#print("Cell 1 column 20 is ",cell_obj.value)

# Enumerate row
# first row is header
# max_row may contains empty row
for i in range (2, sheet_obj.max_row):
    cell_obj = sheet_obj.cell(row=i, column=1)
    if cell_obj.value is not None:
        # enumerate column
        # we need fix column e.q 6 column from 1 to 6
        # Do not use max_column
        # for j in range (1, max_column): # Do not use this, use fix number
        for j in range (1, 6):
            mycell_obj = sheet_obj.cell(row=i, column=j)
            #if mycell_obj is not None: # if you use
max_column, this will not working, only working for row
            print(mycell_obj.value, " | ", end='') # print without new line
        print() # print a new line
    else:
        # the row is empty we break
        print() # print a new line
        break