
Ken Kehoe wrote a nice blog post about machine learning and product management. My favorite section is where he distilled the key responsibilities of a PM on machine learning projects, which I’ve quoted here:
What PMs Do on Machine Learning Teams
Step 1: Identifying the problem - Identify the business and product objectives and criteria for success.
- PM Obligation: High. This is your key responsibility — make sure that the problem you’re trying to solve (problem statement), the reason you’re trying to solve it (business case) and the measures for success (KPIs) are crystal clear to the team.
Step 2: Gathering & cleaning the data - Identify the right data sets, verify their quality, and format / clean / combine as necessary.
- PM Obligation: Medium. In order to identify the right data sets you’ll need to brainstorm predictive features for your target outcome. This is typically guided by domain knowledge, human insight, and common sense. PM and Eng benefit from partnering on this, but the engineer will do the heavy lifting.
Step 3: Feature engineering - Create necessary derived columns from the data and identify trends / outliers.
- PM Obligation: Medium / Low. Related to the above, a model’s predictive accuracy can be improved by performing transformations on the data (deriving ratios, raising to a power, etc.). Think critically about whether these types of transformations make sense for your data. An experienced data scientist will ideally take the lead here, with your input.
Step 4: Building the data model - Select an appropriate model, train it on the sample data, fine-tune for out-of-sample accuracy.
- PM Obligation: Low. Model selection and optimization should be handled by the data scientist — ask questions if you’re curious.
Step 5: Testing and QA - Observe the model output / accuracy on out-of-sample data and refine as needed.
- PM Obligation: High. As PM, you’re likely accountable for the success / failure of the project, so you must determine whether you can ship a model based on its behavior. Ask for a sample output sheet that demonstrates the model’s behavior in a variety of scenarios. If it isn’t doing what you want it to, sit down with the data scientist and highlight examples where the behavior isn’t acceptable, and ask them to explain how they will address these issues on the next pass.
Step 6: Launching & testing - Productionize the model and see if it actually works.
- PM Obligation: High / Medium. Unless you have a counterpart in Analytics, it’s likely your responsibility to design a test plan, ensure proper tracking is in place, and analyze the results.
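
To make the transformations mentioned in Step 3 (deriving ratios, raising to a power) a little more concrete, here is a minimal sketch in plain Python. It is not from the quoted post: the records, the field names (`clicks`, `impressions`, `tenure_months`) and the helper `add_derived_features` are all hypothetical, chosen only to illustrate what a derived column looks like.

```python
# Hypothetical records; the field names are illustrative assumptions,
# not taken from the quoted post.
rows = [
    {"clicks": 30, "impressions": 1000, "tenure_months": 4},
    {"clicks": 5, "impressions": 0, "tenure_months": 12},
]


def add_derived_features(row):
    """Return a copy of `row` with one ratio feature and one power feature."""
    out = dict(row)  # copy so the raw record is left untouched
    # Ratio feature: clicks per impression. Use None when the denominator
    # is zero instead of raising ZeroDivisionError.
    if row["impressions"]:
        out["click_rate"] = row["clicks"] / row["impressions"]
    else:
        out["click_rate"] = None
    # Power feature: squared tenure, which can help a linear model
    # pick up a curved relationship.
    out["tenure_months_sq"] = row["tenure_months"] ** 2
    return out


features = [add_derived_features(r) for r in rows]
```

Whether a ratio or a squared term makes sense depends on the data, which is where the "think critically" advice in Step 3 comes in: for example, the zero-impression record above has no meaningful click rate, and someone has to decide how the model should treat it.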