VevestaX


Track failed and successful Machine Learning experiments as well as features.

VevestaX is an open source Python package for ML engineers and data scientists. It includes modules for tracking features sourced from data, engineered features, and modelling variables. The output is an Excel file with three tabs: data sourcing, feature engineering, and modelling. The library can be used from a Jupyter notebook, from IDEs such as Spyder, or when running a Python script from the command line. VevestaX is framework agnostic, so you can use it with any machine learning or deep learning framework.

How to install the library:

pip install vevestaX

How to import the library and create the object

#import the vevesta Library
from vevestaX import vevesta as v
V=v.Experiment()

How to extract features present in input data.
Code snippet:

#read the dataset
import pandas as pd
df=pd.read_csv("salaries.csv")
df.head(2)

# Extract the column names for features
V.ds=df
# you can also use:
#   V.dataSourcing = df

# Print the features being used
V.ds
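To make concrete what "extracting the column names" amounts to, here is a minimal pandas-only sketch. It assumes the data sourcing tab is built from the DataFrame's column names. The in-memory DataFrame and its columns (`YearsExperience`, `Salary`) are hypothetical stand-ins for salaries.csv, and `sourced_features` is an illustrative name, not part of the library.

```python
import pandas as pd

# Hypothetical stand-in for salaries.csv; the column names are assumptions.
df = pd.DataFrame({"YearsExperience": [1.1, 2.0], "Salary": [39343, 43525]})

# The features available at data sourcing time are simply the column names.
sourced_features = df.columns.tolist()
print(sourced_features)
```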

How to extract engineered features

Code snippet:

# Extract the engineered features
V.fe=df
# you can also use:
#   V.featureEngineering = df

# Print the engineered features
V.fe
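One way to think about engineered features is as the columns that exist after feature engineering but were absent at data sourcing time. The sketch below illustrates that idea with plain pandas. Whether the library distinguishes the two sets in exactly this way is an assumption, and the DataFrame, the `SalaryPerYear` column, and the variable names are hypothetical.

```python
import pandas as pd

# Hypothetical stand-in for the sourced data; column names are assumptions.
df = pd.DataFrame({"YearsExperience": [1.1, 2.0], "Salary": [39343, 43525]})
sourced_features = df.columns.tolist()

# Feature engineering step: derive a new column from the existing ones.
df["SalaryPerYear"] = df["Salary"] / df["YearsExperience"]

# Engineered features: columns present now that were not sourced originally.
engineered_features = [c for c in df.columns if c not in sourced_features]
print(engineered_features)
```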

How to track variables used in the modelling section of the code. V.start() and V.end() delimit a code block and can be called multiple times to track the variables used within each block. Any technique, such as XGBoost or a decision tree, can be used within this code block.
Code snippet:

# Track the variables used for modelling
V.start()
# you can also use:
#   V.startModelling()


# All the variables mentioned here will be tracked
epochs=100
seed=3
loss='rmse'


# End tracking of variables
V.end()
# or, you can also use:
#   V.endModelling()
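The start/end pattern above can be pictured as a snapshot-and-diff over a namespace: record the variables that exist at the start, then report those that are new or changed at the end. The following is a minimal sketch of that idea only, not vevestaX's implementation. `VariableTracker` and the explicit `namespace` dictionary are hypothetical, and restricting tracking to simple scalar types is an assumption.

```python
# Illustrative only: a tiny snapshot-and-diff tracker, not vevestaX's code.
class VariableTracker:
    """Records simple variables that appear or change between start() and end()."""

    TRACKED_TYPES = (int, float, str, bool)

    def __init__(self):
        self._before = {}
        self.tracked = {}

    def _snapshot(self, namespace):
        # Keep only public names bound to simple scalar values.
        return {
            name: value
            for name, value in namespace.items()
            if not name.startswith("_") and isinstance(value, self.TRACKED_TYPES)
        }

    def start(self, namespace):
        self._before = self._snapshot(namespace)

    def end(self, namespace):
        # Anything new or changed since start() is considered tracked.
        for name, value in self._snapshot(namespace).items():
            if name not in self._before or self._before[name] != value:
                self.tracked[name] = value
        return self.tracked


namespace = {"learning_rate": 0.1}  # defined before the block, so not tracked
tracker = VariableTracker()

tracker.start(namespace)
namespace.update(epochs=100, seed=3, loss="rmse")
tracked = tracker.end(namespace)

print(tracked)
```

Because the tracker accumulates results across calls, a second start/end pair adds to the same record, which mirrors the idea of calling the block multiple times in one script.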

How to dump the features and modelling variables into a given xlsx file
Code snippet:

# Dump the data sourcing, engineered features and tracked variables into an xlsx file
V.dump(techniqueUsed='XGBoost',filename="vevestaDump1.xlsx",message="XGboost with data augmentation was used",version=1)

Alternatively, write the experiment into the default file, vevesta.xlsx
Code snippet:

V.dump(techniqueUsed='XGBoost')
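Once an experiment has been dumped, the workbook can be inspected with pandas. The sketch below is one possible way to do that. `load_experiment_sheets` is a hypothetical helper, not part of vevestaX. It loads every tab without hard-coding tab names, and it assumes an Excel engine such as openpyxl is installed for reading xlsx files.

```python
from pathlib import Path

import pandas as pd


def load_experiment_sheets(path="vevesta.xlsx"):
    """Return {sheet name: DataFrame} for every tab, or {} if the file is missing."""
    if not Path(path).exists():
        return {}
    # sheet_name=None loads all sheets, so no tab names need to be hard-coded.
    return pd.read_excel(path, sheet_name=None)


# Print the name and shape of each tab in the default output file, if present.
for name, sheet in load_experiment_sheets().items():
    print(name, sheet.shape)
```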

A sample output Excel file has been uploaded to Google Sheets. Its URL is here

If you like the library, please give us a GitHub star.

For additional features, explore our tool at Vevesta. For comments, suggestions and early access to the tool, reach out at [email protected]

We at Vevesta Labs maintain this library and welcome feature requests. A detailed blog post on vevestaX is available on Medium.
