HTML Table Generation with Grouped Data and Custom Column Names using Python
Creating an HTML Table from a Pandas DataFrame
In this article, we’ll explore how to convert a Pandas DataFrame into an HTML table. This process is useful when you want to present data in a web-friendly format. We’ll walk through a sample implementation, breaking down the code step by step.
Step 1: Setting Up Your Environment
Before we start coding, ensure you have Python and Pandas installed. You can install Pandas using pip if you haven't done so:
pip install pandas
Step 2: Creating a Sample DataFrame
Let’s begin by creating a sample DataFrame. This DataFrame simulates some lab data with columns such as Company name, vndr_nm, cmpn_code, Year_Month, and Distinct_Count_of_Records.
import pandas as pd # Sample DataFrame data = { 'Company name': ['xyzcorp', 'xyzcorp', 'xyzcorp', 'abcltd', 'abcltd'], 'vndr_nm': ['xyzcorp', 'xyzcorp', 'xyzcorp', 'abc ltd', 'abc ltd'], 'cmpn_code': [123456, 123456, 123457, 123458, 123458], 'Year_Month': ['April 2024', 'May 2024', 'April 2024', 'May 2024', 'April 2024'], 'Distinct_Count_of_Records': [5, 8, 3, 6, 7] } df = pd.DataFrame(data)
Step 3: Grouping the Data
Next, we group the data by certain columns. This allows us to prepare our data for a cleaner presentation in HTML.
grouped = df.groupby(['Company name', 'vndr_nm', 'cmpn_code'])
Step 4: Setting Up the HTML Structure
We initialize a string to hold our HTML content, including the necessary HTML headers, styles, and an empty table structure.
html = """ <!DOCTYPE html> <html lang="en"> <head> <meta charset="UTF-8"> <meta name="viewport" content="width=device-width, initial-scale=1.0"> <title>Grouped Data and Custom Column Names using Python</title> <style> body { font-family: Arial, sans-serif; margin: 0; padding: 20px; background-color: #f4f4f4; } h2 { text-align: center; margin-bottom: 20px; } table { width: 100%; border-collapse: collapse; background-color: white; } th, td { padding: 10px; text-align: left; border-bottom: 1px solid #ddd; } th { background-color: #4CAF50; color: white; } tr:nth-child(even) { background-color: #f2f2f2; } </style> </head> <body> <h2>Grouped Data and Custom Column Names using Python</h2> <table> <thead> <tr> <th>Company name</th> <th>vndr_nm</th> <th>cmpn_code</th> <th>Year_Month</th> <th>Distinct_Count_of_Records</th> </tr> </thead> <tbody> """
Step 5: Iterating Through the DataFrame
We loop through the DataFrame to create HTML rows. We use a dictionary to keep track of the previous values, ensuring that we only display unique values in grouped columns.
# Iterate through the dataframe to create the HTML rows prev_values = {} # To track the last values in the group to avoid repetition for index, row in df.iterrows(): src_cmpn_nm = row['Company name'] vndr_nm = row['vndr_nm'] cmpn_code = row['cmpn_code'] year_month = row['Year_Month'] distinct_count = row['Distinct_Count_of_Records'] # Only display non-duplicate values in grouped columns src_cmpn_nm_td = src_cmpn_nm if prev_values.get('Company name') != src_cmpn_nm else '' vndr_nm_td = vndr_nm if prev_values.get('vndr_nm') != vndr_nm else '' cmpn_code_td = cmpn_code if prev_values.get('cmpn_code') != cmpn_code else '' # Create HTML row html += f""" {src_cmpn_nm_td} {vndr_nm_td} {cmpn_code_td} {year_month} {distinct_count} """ # Update previous values prev_values['Company name'] = src_cmpn_nm prev_values['vndr_nm'] = vndr_nm prev_values['cmpn_code'] = cmpn_code
Step 6: Closing the HTML Document
After populating the table, we close the HTML tags and finalize our document.
html += """
</tbody>
</table>
</body>
</html>
"""
Step 7: Outputting the HTML
Finally, you can write this HTML content to a file or display it as needed. For example, you could save it to a file:
with open('output.html', 'w') as file:
file.write(html)
Conclusion
This method provides a straightforward way to transform your DataFrame into a visually appealing HTML table. By using Pandas for data manipulation and basic HTML for presentation, you can create reports or dashboards that are easy to share and view.
Final Code
import pandas as pd
# Sample DataFrame
data = {
'Company name': ['xyzcorp', 'xyzcorp', 'xyzcorp', 'abcltd', 'abcltd'],
'vndr_nm': ['xyzcorp', 'xyzcorp', 'xyzcorp', 'abc ltd', 'abc ltd'],
'cmpn_code': [123456, 123456, 123457, 123458, 123458],
'Year_Month': ['April 2024', 'May 2024', 'April 2024', 'May 2024', 'April 2024'],
'Distinct_Count_of_Records': [5, 8, 3, 6, 7]
}
df = pd.DataFrame(data)
grouped = df.groupby(['Company name', 'vndr_nm', 'cmpn_code'])
# Initialize HTML content
html = """
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Grouped Data and Custom Column Names using Python</title>
<style>
body {
font-family: Arial, sans-serif;
margin: 0;
padding: 20px;
background-color: #f4f4f4;
}
h2 {
text-align: center;
margin-bottom: 20px;
}
table {
width: 100%;
border-collapse: collapse;
background-color: white;
}
th, td {
padding: 10px;
text-align: left;
border-bottom: 1px solid #ddd;
}
th {
background-color: #4CAF50;
color: white;
}
tr:nth-child(even) {
background-color: #f2f2f2;
}
</style>
</head>
<body>
<h2>Grouped Data and Custom Column Names using Python</h2>
<table>
<thead>
<tr>
<th>Company name</th>
<th>vndr_nm</th>
<th>cmpn_code</th>
<th>Year_Month</th>
<th>Distinct_Count_of_Records</th>
</tr>
</thead>
<tbody>
"""
# Iterate through the dataframe to create the HTML rows
prev_values = {} # To track the last values in the group to avoid repetition
for index, row in df.iterrows():
src_cmpn_nm = row['Company name']
vndr_nm = row['vndr_nm']
cmpn_code = row['cmpn_code']
year_month = row['Year_Month']
distinct_count = row['Distinct_Count_of_Records']
# Only display non-duplicate values in grouped columns
src_cmpn_nm_td = src_cmpn_nm if prev_values.get('Company name') != src_cmpn_nm else ''
vndr_nm_td = vndr_nm if prev_values.get('vndr_nm') != vndr_nm else ''
cmpn_code_td = cmpn_code if prev_values.get('cmpn_code') != cmpn_code else ''
# Create HTML row
html += f"""
<tr>
<td>{src_cmpn_nm_td}</td>
<td>{vndr_nm_td}</td>
<td>{cmpn_code_td}</td>
<td>{year_month}</td>
<td>{distinct_count}</td>
</tr>
"""
# Update previous values
prev_values['Company name'] = src_cmpn_nm
prev_values['vndr_nm'] = vndr_nm
prev_values['cmpn_code'] = cmpn_code
html += """
</tbody>
</table>
</body>
</html>
"""
My Other Blogs might be interested.
To See all the posts: - Click Here
Nice article. Thanks for sharing.
ReplyDeleteMLOps Course
MLOps Course in Hyderabad
MLOps Online Course
MLOps Online Training
Machine Learning Operations Training
MLOps Training Course
MLOps Training Online