Designed for a data scientist
Simplifying the life of those who collect and process data
pyreports is a library that simplifies collecting data from multiple sources such as databases, files and directory servers (through LDAP), processing it with built-in and custom functions, and saving the results in various formats or inserting them into a database.
It saves you time: you no longer need to worry about creating connections or managing files by hand.
Every line of code is PEP 8 compliant
Installing pyreports is easy with pip.
$ pip install pyreports
In this example, I take the data from a database table, filter the rows I need and save them in a CSV file.
import pyreports
# Select source: this is a DatabaseManager object
mydb = pyreports.manager('mysql', host='mysql1.local', database='login_users', user='dba', password='dba0000')
# Get data
mydb.execute('SELECT * FROM site_login')
site_login = mydb.fetchall()
# Filter data
error_login = pyreports.Executor(site_login)
error_login.filter([400, 401, 403, 404, 500])
# Save report: this is a FileManager object
output = pyreports.manager('csv', '/home/report/error_login.csv')
output.write(error_login.get_data())
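To make the filter step above concrete, here is a minimal standard-library sketch of what such a filter is assumed to do: keep every row in which at least one field equals one of the given values. The `filter_rows` helper and the sample rows are hypothetical and only illustrate the idea; they are not the library's implementation.

```python
# Hypothetical rows standing in for the result of mydb.fetchall()
site_login = [
    ("alice", "2023-01-10", 200),
    ("bob", "2023-01-10", 401),
    ("carol", "2023-01-11", 500),
    ("dave", "2023-01-11", 302),
]


def filter_rows(rows, values):
    """Keep the rows in which at least one field equals one of the values."""
    wanted = set(values)
    return [row for row in rows if wanted.intersection(row)]


# Same list of HTTP error codes used in the example above
error_login = filter_rows(site_login, [400, 401, 403, 404, 500])
print(error_login)
```

With the sample data, only the rows for `bob` and `carol` survive, because they are the only ones containing one of the listed codes.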
In this example we will analyze and capture parts of a web server log. For each error code present in the log, we will create a report that will be inserted in a book, where each sheet will contain the details of the error code. In the last sheet, there will be an element counter for every single error present in the report.
import pyreports
import tablib
# Get apache log data: this is a FileManager object
apache_log = pyreports.manager('log', '/var/log/httpd/error.log')
# Read log based on regexp
# Headers follow the order of the regexp capture groups
data_log = apache_log.read(r'([(\d\.)]+) - - \[(.*?)\] "(.*?)" (\d+) - "(.*?)" "(.*?)"',
                           headers=['ip', 'date', 'operation', 'code', 'url', 'client'])
# Create a collection of Report objects
all_apache_error = pyreports.ReportBook(title='Apache error on my site')
# Create a Report object based on error code
all_error = set(data_log['code'])
for code in all_error:
    all_apache_error.add(pyreports.Report(data_log, filters=[code], title=f'Error {code}'))
# Count the occurrences of each error code
counter = pyreports.counter(data_log, 'code')
# Append a new Report with the error code counters to the ReportBook
error_counter = tablib.Dataset(counter.values(), headers=counter)
all_apache_error.add(pyreports.Report(error_counter))
# Save ReportBook on Excel
all_apache_error.export('/home/report/apache_log_error_code.xlsx')
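The parsing and counting steps of the log example can be tried without any log file. The sketch below applies the same regular expression with the standard `re` module to a few made-up log lines, names each capture group in order, and counts the status codes with `collections.Counter`. The sample lines and the `HEADERS` ordering are assumptions for illustration; this is not how pyreports reads logs internally.

```python
import re
from collections import Counter

# Same pattern as in the example above, written as a raw string
LOG_PATTERN = re.compile(
    r'([(\d\.)]+) - - \[(.*?)\] "(.*?)" (\d+) - "(.*?)" "(.*?)"'
)
# One name per capture group, in the order the groups appear
HEADERS = ['ip', 'date', 'operation', 'code', 'url', 'client']

# Hypothetical log lines, shaped to match the pattern
sample_lines = [
    '192.168.0.1 - - [10/Oct/2023:13:55:36 +0000] "GET /admin HTTP/1.1" '
    '403 - "http://example.com/" "Mozilla/5.0"',
    '192.168.0.2 - - [10/Oct/2023:13:56:01 +0000] "GET /missing HTTP/1.1" '
    '404 - "-" "curl/8.0"',
    '192.168.0.1 - - [10/Oct/2023:13:57:12 +0000] "GET /secret HTTP/1.1" '
    '403 - "-" "Mozilla/5.0"',
]

rows = []
for line in sample_lines:
    match = LOG_PATTERN.search(line)
    if match:  # skip lines that do not fit the pattern
        rows.append(dict(zip(HEADERS, match.groups())))

# Count how many times each status code appears
code_counter = Counter(row['code'] for row in rows)
print(code_counter)
```

Note that the fourth capture group is the numeric status code, so the header list has to place `code` in fourth position for the per-code reports to group on the right column.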
pyreports has a command line interface which takes a configuration file in YAML format as an argument.
$ cat car.yml
reports:
- report:
  title: 'Red ford machine'
  input:
    manager: 'mysql'
    source:
      # Connection parameters of my mysql database
      host: 'mysql1.local'
      database: 'cars'
      user: 'admin'
      password: 'dba0000'
    params:
      query: 'SELECT * FROM cars WHERE brand = %s AND color = %s'
      params: ['ford', 'red']
    # Filter km
    filters: [40000, 45000]
  output:
    manager: 'csv'
    filename: '/tmp/car_csv.csv'
$ report car.yml
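The configuration above describes a three-stage pipeline: a parameterized query, a filter, and a CSV output. The following sketch reproduces those stages with the standard library only, so it runs without a MySQL server. It swaps in an in-memory SQLite database (which uses `?` placeholders instead of `%s`), invents a small `cars` table, and assumes the filter keeps rows containing any of the listed values; none of this is taken from the pyreports command line implementation.

```python
import csv
import io
import sqlite3

# In-memory database with hypothetical data standing in for the MySQL table
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE cars (brand TEXT, color TEXT, km INTEGER)')
conn.executemany(
    'INSERT INTO cars VALUES (?, ?, ?)',
    [
        ('ford', 'red', 40000),
        ('ford', 'red', 52000),
        ('ford', 'blue', 45000),
        ('fiat', 'red', 45000),
    ],
)

# Input stage: the values in params are bound to the placeholders in order
query = 'SELECT * FROM cars WHERE brand = ? AND color = ?'
params = ['ford', 'red']
rows = conn.execute(query, params).fetchall()
conn.close()

# Filter stage: keep rows where any field matches one of the filter values
filters = [40000, 45000]
filtered = [row for row in rows if any(value in filters for value in row)]

# Output stage: write CSV to an in-memory buffer instead of /tmp/car_csv.csv
buffer = io.StringIO()
csv.writer(buffer).writerows(filtered)
print(buffer.getvalue())
```

Binding the values through `params` rather than formatting them into the query string lets the database driver handle quoting, which is why the configuration keeps the query and its parameters as separate keys.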
The complete documentation is much more comprehensive than this overview.
pyreports is free and open source.
The license is GPLv3 and you can consult it directly here: GitHub License
Donate: PayPal
If you are interested in improvements or have innovation proposals, or simply want to help improve the library, you can fork the project from here! Thank you for your support!
I hope you find this Python library useful.
Feel free to get in touch if you have any questions or suggestions.