QWERTY: How to build an end-to-end Machine Learning Web App to determine which customers are likely to stop using a bank with the MindsDB Python SDK

Introduction

What if I told you there was a way for business owners to not only know the rate at which customers discontinue their services or products but also to control this rate? That rate is called churn, and predicting it is the subject of this tutorial. Churn is a measure of customer loyalty and a key factor in determining a company's growth and profitability. As you will learn in this tutorial, Artificial Intelligence (AI) and Machine Learning models have been revolutionary in helping businesses scale and optimize their processes.

In this tutorial, you will learn how businesses can predict and mitigate churn to retain their customers and increase their bottom line. The use case is predicting customer churn for a fictional ABC Bank. You will start with data from a .csv file that you will get from Kaggle. Thereafter, you will use MindsDB's cloud AutoML capabilities to build a Machine Learning model from scratch, use the MindsDB Python SDK to integrate it into a web app, and employ the Streamlit Python library to build the app's interactive interface and streamline its deployment.

In essence, in this tutorial, you will go from .csv file to a Machine Learning web app that can be used by anyone with internet access.

  • The final web app you will have built and deployed in this tutorial. You can try it for yourself here.

Prerequisites

To follow along with this article, you will need:

  • A basic understanding of Machine Learning and how it works. You can read about Machine Learning models and how they work here.

  • A Kaggle account which you will use to access the data used for this tutorial. Sign up here.

  • A MindsDB account. If you don't have one, sign up for a free account here.

Why MindsDB?

With the current wave of AI, imagine a team of problem solvers who have a business problem that machine learning could solve but lack the core skills needed to build machine learning models. In a world without MindsDB, that would be quite the problem. Fortunately, MindsDB, an open-source platform, simplifies the process with AutoML capabilities. With MindsDB, as you will see in this tutorial, you can easily analyze data, engineer features, and generate accurate models. Additionally, the platform is flexible enough to customize models to meet specific business needs and supports a range of data sources. MindsDB democratizes machine learning, making it accessible to more users and enabling organizations to leverage the power of AI for growth and innovation. Now, let's get to using it.

The Dataset

The dataset used in this tutorial is a simple binary classification dataset with ten features that classify the churn likelihood of customers of ABC Multistate Bank into 0 or 1. Here, 1 means the client left the bank during some period and 0 means they did not. The columns of the dataset used as input are:

  • credit_score: the customer's credit score.

  • country: the customer's country of residence.

  • gender: the customer's sex.

  • age: the customer's age.

  • tenure: how many years the customer has had a bank account in ABC Bank.

  • balance: the customer's account balance.

  • products_number: the number of products the customer has from the bank.

  • credit_card: whether the customer has a credit card.

  • active_member: whether the customer is an active member of the bank.

  • estimated_salary: the customer's estimated salary.

Here is an analysis of the columns of the dataset:

You can download the dataset here.
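Once the file is downloaded, it can help to confirm that its header contains the columns described in this section before uploading it. The snippet below is a minimal sketch using only the Python standard library; the in-memory sample row is made up for illustration, and the real file may contain extra columns that this check simply ignores.

```python
import csv
import io

# Column names as described in this section; "churn" is the label column.
FEATURE_COLUMNS = [
    "credit_score", "country", "gender", "age", "tenure", "balance",
    "products_number", "credit_card", "active_member", "estimated_salary",
]
TARGET_COLUMN = "churn"


def missing_columns(csv_file):
    """Return the expected columns that are absent from the CSV header."""
    header = next(csv.reader(csv_file), [])
    present = {name.strip() for name in header}
    return [c for c in FEATURE_COLUMNS + [TARGET_COLUMN] if c not in present]


# Tiny in-memory sample standing in for the downloaded file (made-up values).
sample = io.StringIO(
    "credit_score,country,gender,age,tenure,balance,products_number,"
    "credit_card,active_member,estimated_salary,churn\n"
    "600,France,Female,40,3,0,1,1,1,50000,0\n"
)
print(missing_columns(sample))
```

To check your own download, pass an opened file instead of the sample, for example open("your_file.csv", newline=""), where the file name is whatever you saved it as.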

Step 1: Uploading the data

The first step in this tutorial is to upload the dataset you downloaded. To do this:

  • Log in to your MindsDB account (since you already have one), navigate to the Add button on the top left corner, and click on it.

  • Click on the Upload file button to open the data upload dialog.

    The data upload dialog allows you to upload files of different formats including .csv, .xlsx, .xls, .json with a maximum size of 500 MB. The dialog takes the form shown below:

  • Click on the Save and Continue button to complete the upload process.

Note:

  1. It is important to be consistent with the Datasource name you use for your dataset. Here we use bank_churn which will later be referenced in our SQL code.

  2. It is important to note that MindsDB boasts a range of integrations that facilitate its compatibility with various data sources. Through these integrations, users can swiftly connect and import data from databases like MongoDB, PostgreSQL, and MySQL as well as cloud-based services such as Google Sheets.
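Before uploading, you may want to check locally that your file matches the formats and size limit quoted for the upload dialog. The following is a rough sketch under two assumptions: that the four listed extensions are the ones you care about, and that 500 MB means 500 * 1024 * 1024 bytes. The function name upload_problems is made up for this example.

```python
import os
import tempfile

# Formats and size limit as quoted in the upload dialog description above.
# Treating 500 MB as 500 * 1024 * 1024 bytes is an assumption.
ALLOWED_EXTENSIONS = {".csv", ".xlsx", ".xls", ".json"}
MAX_BYTES = 500 * 1024 * 1024


def upload_problems(path):
    """Return a list of reasons the file might be rejected (empty if none)."""
    problems = []
    extension = os.path.splitext(path)[1].lower()
    if extension not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported extension: {extension or '(none)'}")
    if os.path.getsize(path) > MAX_BYTES:
        problems.append("file is larger than 500 MB")
    return problems


# Demonstration with a throwaway file; replace with the path to your download.
with tempfile.NamedTemporaryFile(suffix=".csv", delete=False) as handle:
    handle.write(b"credit_score,churn\n600,0\n")
print(upload_problems(handle.name))
os.remove(handle.name)
```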

Step 2: Building/Training the Machine Learning Model

In this step, you will build the machine learning model and train it on the uploaded dataset. Upon successful upload from the previous step, you will see an interface resembling the one displayed below:

To preview the dataset, comment out line 7 to disable it, and then run the code by pressing Shift+Enter or clicking the Run button on the top left pane. The output displayed at the bottom of the screen will resemble what is shown below, returning the first ten rows of the dataset.

At this point, you will be building your Machine Learning model using the churn column of the dataset as the label to predict.

CREATE MODEL mindsdb.qwerty_churn_predictor
FROM files
  (SELECT * FROM bank_churn)
PREDICT churn;

In the code block above, you use the CREATE MODEL statement to create a model. Thereafter, you select the file bank_churn (your Datasource name) as the training data for the model. Finally, you use the PREDICT keyword to name the churn column as the target variable the model should predict.

You will see the following output upon running the code:

This predictor may require several minutes to complete training. As such, to track the predictor's progress, you can employ this SQL command:

SELECT status
FROM mindsdb.models
WHERE name='qwerty_churn_predictor';

If the model is still running, you will see the following output:

And when it is done running, you will see the following status message when you run the SQL script:

With the status message now saying complete, you are done training your Machine Learning model on the input dataset. In the next step, you will learn how to use the MindsDB Python SDK and how it helps you connect the trained model to a Python app.
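If you would rather not re-run the status query by hand, the check can be wrapped in a small polling loop. The sketch below is generic Python and not part of MindsDB: get_status is a function you would supply yourself (for instance one that runs the SELECT status query), and the status names other than complete are assumptions for this example.

```python
import time


def wait_for_model(get_status, poll_seconds=5.0, timeout_seconds=600.0):
    """Poll get_status() until it returns 'complete' or 'error'.

    get_status is a caller-supplied function (hypothetical here) that returns
    the current status string, for example by running the SELECT shown above.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = get_status()
        if status in ("complete", "error"):
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"model still '{status}' after {timeout_seconds}s")
        time.sleep(poll_seconds)


# Stand-in status source that reports 'training' twice, then 'complete'.
fake_statuses = iter(["training", "training", "complete"])
print(wait_for_model(lambda: next(fake_statuses), poll_seconds=0.01))
```

Passing the status source in as a function keeps the loop independent of how you talk to MindsDB, so the same helper works whether you query over SQL or through the SDK.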

Step 3: Using the MindsDB Python SDK

In this step, you will use the MindsDB Python SDK to access and use the Machine Learning model you trained with MindsDB. In general, the MindsDB SDK enables you to connect to a MindsDB server from Python using its HTTP API. The MindsDB Python SDK provides a powerful set of tools and APIs for deploying machine learning models using the MindsDB framework, and although it can be used for a range of use cases, here you will use it along with Streamlit, both being Python-based. With the Python SDK, you will connect the model you have built on the MindsDB GUI to your Python app.

To do this, you will start by creating a directory for your project and giving it a meaningful name. I named mine and the app as a whole QWERTY - a name whose significance is no more than being the first six keys in the top-left row of the keyboard. This directory you will create will contain all the files and folders related to your project.

Thereafter, you will set up a virtual environment. A virtual environment is an isolated Python environment that allows you to install and manage packages without affecting the global Python installation. To set up a virtual environment, run the following command in your project directory:

python -m venv env

This will create a virtual environment in a folder called "env" within your project directory. To activate the virtual environment on macOS or Linux, run the following command (on Windows, run env\Scripts\activate instead):

source env/bin/activate

This will activate the virtual environment and any packages you install will be installed within the virtual environment.
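If you are unsure whether the activation worked, you can ask Python itself. This small sketch relies on the standard library only: inside an environment created with python -m venv, sys.prefix differs from sys.base_prefix. The function name in_virtualenv is just a label chosen for this example.

```python
import sys


def in_virtualenv():
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the environment directory while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


print("virtual environment active:", in_virtualenv())
```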

Once you've activated the virtual environment, you can install any necessary dependencies using pip. For this tutorial, install MindsDB's Python SDK and Streamlit by running the following commands:

pip install mindsdb_sdk
pip install streamlit

At this point, it is good practice to create a requirements file that lists all the dependencies for your project. To create one, run the following command:

pip freeze > requirements.txt

This will create a file called "requirements.txt" that lists all the installed packages and their versions.

Thereafter, you will create a new Python script within your project directory using your IDE or code editor. This script will contain the code for your project. Add the following lines of code to your Python file.

import streamlit as st
import mindsdb_sdk
import pandas as pd

# Connecting to MindsDB
# Replace the placeholder email and password with your own MindsDB Cloud credentials
server = mindsdb_sdk.connect('https://cloud.mindsdb.com', login='youremail@mailservice.com', password='<YOUR_PASSWORD>')
project = server.get_project("mindsdb")
# Selecting the model to use. Index 0 is used because this is the first model in the list of models in my account
model = project.list_models()[0]

With this done, the variable model now refers to the machine-learning model you trained for your use case and dataset.
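Picking the model by its position in the list works when your account has a single model, but the index can change once you train more. A possible alternative is to look the model up by name, as sketched below. This assumes each object returned by project.list_models() exposes a name attribute; the helper find_model is hypothetical and the SimpleNamespace objects are stand-ins so the snippet runs without a MindsDB connection.

```python
from types import SimpleNamespace


def find_model(models, name):
    """Return the first model whose .name equals name, or raise LookupError."""
    for candidate in models:
        if getattr(candidate, "name", None) == name:
            return candidate
    available = [getattr(m, "name", "?") for m in models]
    raise LookupError(f"no model named {name!r}; available: {available}")


# Stand-ins for the objects returned by project.list_models().
models = [SimpleNamespace(name="other_model"),
          SimpleNamespace(name="qwerty_churn_predictor")]
print(find_model(models, "qwerty_churn_predictor").name)
```

In the app, the call would look like find_model(project.list_models(), "qwerty_churn_predictor"), using the model name from Step 2.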

Step 4: Creating a User Interface for the web app with Streamlit and running it locally

In this step, you will create a user interface for the web app with Streamlit. Since this tutorial focuses on MindsDB as opposed to Streamlit, it will not go in-depth on what each line of code in this step means. However, you can find a comprehensive explanation in the official Streamlit documentation.

Here is the code you will use to create the user interface on Streamlit.

# Web App title
st.title("QWERTY Churn Predictor")
st.subheader("A Machine Learning App that predicts which customers are likely to stop using a bank")
st.subheader("Enter the Following Details")

# Retrieving Input from user
credit_score = st.sidebar.slider("Credit Score", min_value=350, max_value=850)
country = st.sidebar.radio("Country of Residence", options=["France", "Spain", "Germany"], horizontal=True)
gender = st.sidebar.radio("Gender", options= ["Male", "Female"], horizontal=True)
age = st.sidebar.slider("Age", min_value=18, max_value=100)
tenure = st.sidebar.slider("From how many years he/she is having bank acc in ABC Bank", min_value=0, max_value=10)
balance = st.sidebar.slider("Account Balance", min_value=0, max_value=250000)
products_number = st.sidebar.slider("Number of Products", min_value=1, max_value=4)
credit_card = st.sidebar.slider("Does the customer have a credit card? 0 for no, 1 for yes", min_value=0, max_value=1)
active_member = st.sidebar.slider("Is the customer an active member? 0 for no, 1 for yes", min_value=0, max_value=1)

# Create a dictionary to store the input values
variables = {
    "credit_score": credit_score,
    "country": country,
    "gender": gender,
    "age": age,
    "tenure": tenure,
    "balance": balance,
    "products_number": products_number,
    "credit_card": credit_card,
    "active_member": active_member
}

# Convert the input values to a single-row DataFrame
result = pd.DataFrame(variables, index=[0])


# Handler to predict the result
if st.button("Predict"):
    # Read the predicted churn label for the single input row
    prediction = model.predict(result)["churn"].loc[0]
    # Cast defensively in case the label comes back as text rather than a number
    if int(prediction) == 1:
        st.write("This customer is likely to stop using the bank. Please contact them to discuss their account and how they feel about the bank.")
    else:
        st.success("This customer is not likely to stop using the bank!")
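The handler above depends on comparing the predicted label with 1. I have not confirmed which type the SDK returns for the label, so the sketch below shows one way to interpret it whether it arrives as a number or as text. The helper is_churn is a name invented for this example and is not part of MindsDB or Streamlit.

```python
def is_churn(value):
    """Interpret a predicted churn label as a boolean.

    Accepts 1, 1.0, "1" or "1.0" as churn and the matching zero forms as
    no churn; other numeric values raise ValueError.
    """
    number = float(value)
    if number not in (0.0, 1.0):
        raise ValueError(f"unexpected churn label: {value!r}")
    return number == 1.0


for raw in (1, "0", "1.0"):
    print(repr(raw), "->", is_churn(raw))
```

Raising on unexpected labels, rather than silently treating them as "not churn", makes it easier to notice when the model returns something other than 0 or 1.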

With the Streamlit code above in your Python file, you can now run the app from the command line with the command below:

streamlit run "app.py"

It is important to note that we have app.py in this line because the Python file I created is named app.py. However, if you named your file something else, say main.py, the line becomes streamlit run "main.py".

When you run streamlit run "app.py" in the command line, Streamlit will immediately execute the code in the Python file and start a local web server that serves the Streamlit app. The command will return a URL that you can open in a web browser to view the app.

In this example, the app is being served on http://localhost:8501. You can open this URL in a web browser to view the app.

Your app now runs locally!

Step 5: Deploying the web app on Streamlit

While it's useful to run and test your Streamlit app locally during development, production use requires a different approach. A locally run app is only accessible from your machine, so it can't be used by a wider audience. Deploying for production typically involves hosting the app on a server that's accessible from anywhere with an internet connection, which can greatly increase its reach and impact. It often also requires optimizing the app's performance and stability, since it will be serving a much larger user base. This may involve using cloud computing platforms, load balancing, and other techniques to ensure your app can handle a high volume of traffic and provide a reliable user experience.

In this step, you will learn how to deploy your app using Streamlit's app hosting service. Although you can also deploy on other platforms like Heroku and Azure, this tutorial deploys with Streamlit because of its simplicity.

To deploy your app, you will need to add your app directory to GitHub. This is because the Streamlit Community Cloud launches apps directly only from your GitHub repo, so your app code and dependencies need to be on GitHub before you try to deploy the app.

Find the resource to deploy your app here.
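Because the deployed code lives in a GitHub repository, a password written directly into the Python file would be visible to anyone who can read the repo. One common way around this is to read the login details from environment variables instead. The sketch below uses only the standard library; the variable names MINDSDB_EMAIL and MINDSDB_PASSWORD are hypothetical, and how you set them depends on your hosting platform.

```python
import os


def load_credentials(env=None):
    """Read the MindsDB login details from environment variables.

    MINDSDB_EMAIL and MINDSDB_PASSWORD are hypothetical names chosen for
    this sketch; use whatever names you configure in your environment.
    """
    env = os.environ if env is None else env
    missing = [key for key in ("MINDSDB_EMAIL", "MINDSDB_PASSWORD") if not env.get(key)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return env["MINDSDB_EMAIL"], env["MINDSDB_PASSWORD"]


# Demonstration with an explicit mapping instead of the real environment.
email, password = load_credentials(
    {"MINDSDB_EMAIL": "youremail@mailservice.com", "MINDSDB_PASSWORD": "example"}
)
print(email)
```

The returned values can then be passed as the login and password arguments of the mindsdb_sdk.connect call from Step 3.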

Conclusion

In this tutorial, you have seen that building a machine-learning web app to predict customer churn using MindsDB's Python SDK is a straightforward process that does not require extensive knowledge of machine learning. The MindsDB SDK makes it easy for developers to create models, train them on data, and use them to make accurate predictions. With MindsDB's AutoML capabilities, the need for tedious feature engineering and hyperparameter tuning is eliminated, allowing developers to focus on creating an efficient and reliable application. Overall, you have seen through a real-life use case that MindsDB's Python SDK is a powerful tool that streamlines the development of machine-learning applications whilst making them accessible to a wider audience.

I would like to mention that this article is an entry into the MindsDB Hackathon on Hashnode. I hope you found this article informative and useful. Thank you for reading, and wish me luck in the hackathon! Don't forget to check out the GitHub repository linked here for the full code.