This post is by Francesca Lazzeri and Hong Lu, Data Scientists, and Ilan Reiter, Principal Data Science Manager, at Microsoft.
Introduction and Business Domain
Recent advancements in machine learning and big data technologies are allowing companies to apply better staffing strategies by taking advantage of historical data. Ensuring that the right people get assigned to the right projects is critical not only for the success of a given project but also for the overall profitability of an organization. At most companies, project staffing is done manually by project managers, based on staff availability and prior knowledge of individuals’ past performance. This process is not only time-consuming but also often produces sub-optimal results. The same process can be done much more effectively by taking advantage of historical data and advanced ML techniques.
We recently developed a recommendations-based staff allocation solution for Baker Tilly Virchow Krause, LLP, a professional services company. Baker Tilly is a full-service accounting and advisory firm with a focus on industry and services specialization. The solution we have built recommends optimal staff composition as well as individual staff with the right experience and expertise for Baker Tilly’s new projects. By aligning staff experience with project needs, we help project managers at Baker Tilly perform better and faster staff allocation.
The end goal of our solution is to improve profitability at Baker Tilly. Based on our offline evaluation, we expect a 4-5% improvement in profits for the projects that employ our solution. The final solution has been integrated with Baker Tilly’s internal practice management system and will be evaluated by a few pilot teams before being rolled out across all teams.
Solution Overview
The solution has two parts, whose outputs are combined to generate the final staff recommendations:
Part 1: Predict staff composition, e.g. one senior accountant and two accounting assistants, for a new project. In this part, we use K-Nearest Neighbors (KNN) to make predictions based on historical projects with similar properties such as project type and industry.
Part 2: Compute Staff Fitness Score (Rating) for a new project. For this second part, we applied a content-based recommendation algorithm developed in R and executed by Execute R Script modules in Azure ML.
The figure below summarizes the workflow design of this Proof of Concept (PoC):
Baker Tilly Staff Allocation PoC Workflow
Solution Architecture
Baker Tilly has integrated the solution within their practice management tools: new projects in Baker Tilly’s database are processed daily by the Azure ML web service, and the results are consumed by project managers in Baker Tilly’s practice management system. The workforce placement recommendations are also visualized on a real-time Power BI dashboard that data analysts and executives at Baker Tilly can use to monitor project allocation and performance over time.
The figure below shows the solution architecture in detail:
Baker Tilly Solution Architecture
Experiment Design
Part 1: Predict Project Staff Composition
In this step, we used KNN to predict staff composition (i.e. numbers of each staff classification/title) on a new project, using historical project data. We split the dataset in the following way:
- Training data (historical projects): 2010-09-01 to 2015-09-01.
- Testing data (new projects): 2015-09-01 to Present.
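The date-based split above can be sketched in Python as follows. This is only an illustration: the `start_date` field name is hypothetical, and placing projects that start exactly on 2015-09-01 in the testing set is an assumption, since the post does not say which date field was used or how the boundary was handled.

```python
from datetime import date

TRAIN_START = date(2010, 9, 1)
SPLIT_DATE = date(2015, 9, 1)


def split_by_date(projects):
    """Split project records into (training, testing) lists by start date."""
    # Historical projects: on or after TRAIN_START and strictly before SPLIT_DATE.
    training = [p for p in projects if TRAIN_START <= p["start_date"] < SPLIT_DATE]
    # New projects: on or after SPLIT_DATE (boundary handling is an assumption).
    testing = [p for p in projects if p["start_date"] >= SPLIT_DATE]
    return training, testing
```

Splitting by time rather than at random keeps the evaluation honest: every prediction for a new project uses only projects that came before it.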
We found historical projects similar to each new project based on project properties such as Project Type, Total Billing, Industry, Client, Revenue Range, etc. We assigned a different weight to each project property based on business rules and standards. We also removed any projects that had a negative contribution margin (profit). For each staff classification, the staff count is predicted by computing a weighted sum of the similar historical projects’ staff counts for that classification. Using Accountant as an example:
The weight of each historical project is:
The final weights are normalized so that they sum to 1. Before calculating the weighted sum, we removed outliers: the highest 10% and the lowest 10% of values.
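A minimal Python sketch of this kind of weighted KNN prediction is shown below. It rests on several assumptions: the property weights and the `similarity` function are placeholders rather than the business rules used at Baker Tilly, the record layout (`staff_counts`, `project_type`, and so on) is hypothetical, outliers are trimmed on the neighbors' staff counts, and weights are normalized after trimming.

```python
from math import fsum

# Hypothetical property weights; the real ones came from business rules.
PROPERTY_WEIGHTS = {"project_type": 3.0, "industry": 2.0, "revenue_range": 1.0}


def similarity(new_project, past_project, weights=PROPERTY_WEIGHTS):
    """Assumed similarity: total weight of the properties both projects share."""
    return fsum(
        w for prop, w in weights.items()
        if prop in new_project and new_project[prop] == past_project.get(prop)
    )


def predict_staff_count(new_project, history, title, k=10, trim=0.10):
    """Predict the staff count for one title as a weighted sum over similar projects."""
    # Score every historical project and keep the k most similar ones.
    scored = [(similarity(new_project, p), p["staff_counts"].get(title, 0)) for p in history]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    neighbors = [pair for pair in scored[:k] if pair[0] > 0]

    # Drop the lowest and highest `trim` fraction of staff counts as outliers.
    neighbors.sort(key=lambda pair: pair[1])
    cut = int(len(neighbors) * trim)
    kept = neighbors[cut:len(neighbors) - cut]

    # Normalize the remaining weights to sum to 1, then take the weighted sum.
    total = fsum(sim for sim, _ in kept)
    if total == 0:
        return 0.0
    return fsum(sim / total * count for sim, count in kept)
```

Calling `predict_staff_count(new_project, history, "Accountant")` once per staff classification would yield the full predicted composition for a new project.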
Part 2: Predict Staff Fitness Score (Rating) Using Custom Content-Based Filtering
We implemented a content-based algorithm in R to predict how well a staff member’s experience matches a given project’s needs.
In a content-based filtering system, a user profile is usually computed from the user’s historical ratings of items. This profile describes the user’s tastes and preferences. To predict a staff member’s suitability for a new project, we created two profile vectors for each staff member from historical data. The first is based on number of hours and describes the staff member’s experience and expertise across different types of projects. The second is based on contribution margin per hour (CMH) and describes the staff member’s profitability across different types of projects. Staff Fitness Scores for a new project are computed by taking the inner products of these two profile vectors with a binary vector that describes the important properties of the project. Below is a screenshot of the experiment developed to predict Staff Fitness Score (Rating) using Custom Content-Based Filtering:
Experiment Created Using Azure ML
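The inner-product scoring described above can be illustrated with a short Python sketch. The original implementation was written in R and run in Azure ML; this version is only an illustration, and the feature keys (such as `type:audit`) and function names are hypothetical. How the two scores are combined into a single rating is not specified in the text, so the sketch returns them separately.

```python
def inner_product(profile, project_vector):
    """Inner product of a staff profile with a binary project vector.

    Both arguments are dicts keyed by feature name; a feature missing
    from the profile counts as 0.
    """
    return sum(profile.get(feature, 0.0) * flag for feature, flag in project_vector.items())


def fitness_scores(hours_profile, cmh_profile, project_vector):
    """Return the experience and profitability scores for one staff member."""
    return {
        "experience": inner_product(hours_profile, project_vector),
        "profitability": inner_product(cmh_profile, project_vector),
    }


# Illustrative data only: hours worked and CMH per project property.
hours_profile = {"type:audit": 120.0, "industry:retail": 40.0, "type:tax": 300.0}
cmh_profile = {"type:audit": 55.0, "industry:retail": 30.0}
# Binary vector: 1 marks a property that the new project has.
project_vector = {"type:audit": 1, "industry:retail": 1}

scores = fitness_scores(hours_profile, cmh_profile, project_vector)
```

Because the project vector is binary, each score is simply the sum of the profile entries for the properties the new project has, so hours logged on unrelated project types (here `type:tax`) do not contribute.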
Offline Evaluation of Staff Recommendation Solution
To evaluate our staff recommendation solution, we built an ML model (Gradient Boosting Trees) to predict the contribution margin per hour (CMH) of 5,000 randomly sampled projects, using our recommended staff. Based on this offline evaluation, we expect a 4-5% improvement in total contribution margin on projects that employ our solution.
To predict the CMH of a project we used different project features (e.g. Total Hours, Total Billing, Project Type, Industry Group, Revenue Range, State, etc.) and staff features (e.g. number of hours working on Project Type Group, number of hours working on Industry Group, average hourly contribution margin of historical projects of corresponding Project Type Group, average hourly contribution margin of historical projects of corresponding Industry Group).
To train the model, we used project features and staff features of actual allocated staff, and project actual CMH as labels. To evaluate our staff recommendation, we used the trained model to predict project CMH using project features and staff features of recommended staff. Then we compared the predicted CMH with actual project CMH to estimate the potential improvement in CMH using our solution.
The figure below captures the overall workflow of this evaluation approach:
Overall workflow of evaluation approach
The final evaluation metric we used is total contribution margin improvement percentage:
where N is the number of projects.
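One plausible way to compute such a metric is sketched below in Python. The exact formula is an assumption: here each project's total contribution margin is taken to be its CMH multiplied by its total hours, and the improvement is the relative difference between the predicted total (with recommended staff) and the actual total, summed over all N projects. The function and parameter names are hypothetical.

```python
def total_cm_improvement_pct(predicted_cmh, actual_cmh, hours):
    """Percentage improvement in total contribution margin over N projects.

    predicted_cmh: model-predicted CMH per project with recommended staff.
    actual_cmh: observed CMH per project with the staff actually assigned.
    hours: total hours per project (assumed identical in both scenarios).
    """
    if not (len(predicted_cmh) == len(actual_cmh) == len(hours)):
        raise ValueError("all inputs must have one entry per project")
    predicted_total = sum(c * h for c, h in zip(predicted_cmh, hours))
    actual_total = sum(c * h for c, h in zip(actual_cmh, hours))
    if actual_total == 0:
        raise ValueError("actual total contribution margin is zero")
    return 100.0 * (predicted_total - actual_total) / actual_total
```

Weighting by hours means large projects influence the result more than small ones, which matches a metric defined on total rather than average contribution margin.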
Note that this offline evaluation relies on two assumptions. First, we assume our recommended staff will always be available. Second, we assume the CMH prediction model is unbiased: although it has a 13% absolute error, we didn’t observe any systematic bias in its predictions. Therefore, the positive improvement we observed is unlikely to be an artifact of prediction error.
Conclusions
In any professional services organization, no resource is more critical or expensive than the people it employs. However, the human dimension can often seem too complex to turn into a statistic, with limited opportunities for quantitative observation, refinement or improvement. But what if the human element of an organization can, in fact, be quantified and transformed into historical data that can then be used in an advanced analytics and machine learning-based solution?
By making use of a Cortana Intelligence-based framework, we built and deployed a workforce placement recommendation solution that recommends optimal staff composition and individual staff with the right expertise for new projects. Cortana Intelligence, with its collection of cloud-based tools, can help organizations build successful workforce analytics solutions that provide the basis for specific action plans and workforce-based investments. These solutions can address gaps or inefficiencies in organizations’ current staff allocation methods and can help drive better business outcomes. Organizations can gain a competitive edge by using workforce analytics to optimize their use of human capital.
Francesca, Hong & Ilan
You can contact Francesca at lazzeri@microsoft.com or @frlazzeri; Hong at honglu@microsoft.com; and Ilan at ireiter@microsoft.com.