5 things great data science product managers do

Monday, August 12, 2024, 11:00 AM, from InfoWorld

While data science requires experimentation and discovery, deploying data science products and programs to end-users requires disciplines similar to those used to bring a SaaS product to market or roll out a business application to departments. When deploying software, product managers have the critical role of delivering business value and improved user experience, and they have extended responsibilities when working on data science initiatives.

I’m a strong proponent of applying scrum in data science initiatives and assigning product managers to oversee the development of data science products, including data visualizations, machine learning models, and genAI capabilities. By assigning product management responsibilities, organizations align data science initiatives to business strategy and are more likely to develop ongoing enhancements to extendable products rather than experimenting and developing one-off analytics and reports.

How product managers and agile drive business alignment

In The 17th State of Agile Report, nearly 70% of respondents said they use agile methodologies in IT departments and software development. Many organizations have shifted to agile because of issues associated with waterfall project management methodologies, including lack of flexibility and significant upfront planning. According to the survey, nearly 60% of respondents who were satisfied with agile methodology cited increased collaboration and better alignment with business needs as concrete benefits.

A key differentiator in agile is assigning a product owner role and product management function that significantly differ from program and project management responsibilities.

Product owner responsibilities include understanding customer needs, prioritizing capabilities, drafting requirements, and partnering with the scrum team to release delightful customer experiences. Many organizations with multiple teams developing and enhancing products, including customer and employee-facing applications, will assign a higher-level product management role. The product manager’s responsibilities include defining customer personas, drafting vision statements, and creating roadmaps that deliver business value.

While the product management function in data science initiatives has similar objectives to those in software development and other technology initiatives, the following five responsibilities are specific to analytics and data products.

5 product manager responsibilities in data science initiatives

Justify the business value of investing in data science

Drive collaboration between data science and devops

Evaluate the optimal datasets for a quality product

Ensure rigorous ML model testing and monitoring

Measure business value and KPIs

Justify the business value of investing in data science

Product managers are instrumental in aligning initiatives with business priorities. Given all the departments, data sets, and data products where organizations can invest in data science capabilities, which are the most important to focus on and why?

“It’s important that product managers stay laser-focused on helping customers and establishing the ‘why’ behind the development of AI/ML models,” says Madeleine Corneli, lead product manager of AI/ML at Exasol. “Asking how the model will help users solve problems, improve experiences, or make them more effective will help ensure the right models are created.”

Machine learning and AI capabilities are evolving so quickly that it’s easy for data science teams to get caught up with trying out the latest tools and methodologies. “With so many potential ML applications, it’s up to product managers to guide teams towards the right AI solutions. Just because you can build something doesn’t mean you should,” adds Corneli.

A key product management discipline is identifying an initiative’s target customer, value proposition, and strategic business value. Business value from data science initiatives often involves improved decision-making capabilities, increased productivity, and sustained competitive advantages. The data science product, including its data visualizations, predictive models, and LLMs, is only part of the solution.

“AI is the ‘how’ and not the product, so if using AI doesn’t solve a customer problem, you shouldn’t do it,” says Ibrahim Bashir, VP of product management at Amplitude. “If an AI-driven feature doesn’t positively impact a key business metric, such as time-to-value or retention, it shouldn’t be a priority.”

Karl Mattson, director of security technology strategy at Akamai, says that leading product managers first consider the end state of the user or customer experience and work backward to build the product. He says, “For data science initiatives, the end goal is informing quality decisions. We truly have to understand the nature of the decisions to be made on our data product and not be obsessed first over the technical how.”

Drive collaboration between data science and devops

A recent study shows that only 32% of respondents successfully deploy more than 60% of their machine learning models.

One reason for the low model deployment rate is that data science teams must experiment with many models to optimize solutions. Another gap is connecting models into workflows, customer experiences, and automations where predictions can be used in decision-making and other data-driven tasks. Developing the integrations and ensuring models are production-ready often requires collaboration with devops teams. Product managers then have to evaluate the effectiveness of the end-to-end product.

“Typically, it is the responsibility of product managers in data science initiatives to facilitate the work between data science teams and software developers to ensure the accuracy and value of the model being built so that it aligns with product goals,” says Rita Priori, CTO at Thirona. “Product managers should have in-depth AI knowledge, and for product lifecycle, have metrics in mind such as accuracy, speed, and adaptability to continuously assess the performance of models and improve upon them.”

A key skill for product managers is the ability to evaluate the product while agile teams, including data scientists and devops engineers, are iteratively developing it. They should ask technical questions about the implementation and its alignment with product objectives. Product managers on data science initiatives must drive a team culture that’s open to raising questions and evaluating solutions, and this requires a foundational understanding of design thinking principles, data operational practices, machine learning model capabilities, AI’s limitations, and software architectures.

“In developing AI and ML models and platforms, product managers must fundamentally understand the technology and work with design, engineering, and research teams to ensure a product addresses real issues,” says Maryam Ashoori, director of product management for watsonx.ai at IBM.

Evaluate the optimal datasets for a quality product

The informational value, quality, and costs of the datasets to be used are part of the supply-chain considerations when developing data science products. Product managers must ensure that the data ingredients yield a quality and affordable product, which requires evaluating sources and overseeing the dataops manufacturing processes.

“Strong foundational datasets are crucial to launch a successful data science initiative,” says Muli Farkas, VP of product management at Glassbox. “This requires significant collaboration between product managers and data scientists to define and refine the dataset to ensure it represents the real-world conditions the model will address. Product managers are the bridge between user needs and technical execution, and through that, must inform data scientists about the features that will enhance model performance in user-valued ways.”

One key area that requires product management oversight is how data scientists select and partition data sets for model training. Mike Flaxman, VP of product management at Heavy.ai, says, “Key tasks include ensuring the right data is used to train models, meaning appropriate volume and proper quality. Additionally, product management needs to be mindful of avoiding implicit biases when selecting the data and conducting the training.”

Product managers should consult with the team’s data scientists and define the minimal data requirements to support the product’s goals, including data quality metrics such as accuracy, consistency, and timeliness. Data scientists should also demonstrate that training data has statistical significance for the targeted data segmentations, and product managers should oversee how the team evaluates data sets for potential biases.
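As a rough illustration of how such minimal data requirements could be checked, the Python sketch below scores a dataset for timeliness and flags segments with too few training rows. The record layout, the field names (segment, updated), and the thresholds are assumptions made for this example, not requirements from any of the teams quoted here.

```python
from collections import Counter
from datetime import date

def timeliness(records, as_of, max_age_days):
    """Fraction of records updated within max_age_days of the as_of date."""
    if not records:
        return 0.0
    fresh = sum(1 for r in records if (as_of - r["updated"]).days <= max_age_days)
    return fresh / len(records)

def underrepresented_segments(records, segment_key, min_rows):
    """Names of segments with fewer than min_rows rows, sorted alphabetically."""
    counts = Counter(r[segment_key] for r in records)
    return sorted(seg for seg, n in counts.items() if n < min_rows)

# Hypothetical training records; field names are illustrative only.
records = [
    {"segment": "smb", "updated": date(2024, 8, 1)},
    {"segment": "smb", "updated": date(2024, 7, 20)},
    {"segment": "smb", "updated": date(2024, 3, 5)},
    {"segment": "enterprise", "updated": date(2024, 8, 5)},
]
as_of = date(2024, 8, 12)

print(timeliness(records, as_of, max_age_days=30))                # 0.75
print(underrepresented_segments(records, "segment", min_rows=2))  # ['enterprise']
```

A segment that shows up in the second list is a prompt for a conversation with the data scientists about collecting more data or narrowing the product's scope, not an automatic failure.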

Ensure rigorous ML model testing and monitoring

The hard part of testing LLMs is that they aren’t fully deterministic, and the outputs from regression tests can’t be compared word for word. There are also challenges in validating predictive models, and many machine learning models that work off large-scale feature vectors may not be practical to test across all the dimensions and values, even with synthetic data source options.
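One way to work around word-for-word comparison is to score each output against a reference answer and pass the test when the score clears a threshold. The sketch below uses token overlap (Jaccard similarity) as a simple stand-in. Teams often prefer a semantic similarity measure instead, and the function names and the 0.6 threshold here are assumptions for illustration.

```python
import re

def tokens(text):
    """Lowercased set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def jaccard(a, b):
    """Token-set overlap between two strings, from 0.0 to 1.0."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)

def passes_regression(reference, output, threshold=0.6):
    """True when the output is close enough to the reference answer."""
    return jaccard(reference, output) >= threshold

reference = "Your refund will be processed within five business days."
reworded = "The refund will be processed within five business days."
off_topic = "Please restart your router."

print(passes_regression(reference, reworded))   # True
print(passes_regression(reference, off_topic))  # False
```

The threshold is a business decision as much as a technical one: set it too high and harmless rewordings fail the suite, too low and wrong answers slip through.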

Thus, data science products need business input on which use cases to validate, how much to invest in testing, and how to quantify when models are production-ready. Product management should serve as the facilitator between legal, risk, security, and other stakeholders to define minimal testing criteria, set a budget for testing, and decide how oversight will be coordinated.

“You need a detailed, extended, and meaningful set of tests because so many things can go wrong when developing a data science application,” says Rosaria Silipo, head of data science evangelism at KNIME. “For example, you could have different data available in production than the data available in the development lab, or you could have experienced data leakage while training the model.”

Silipo recommends that data science teams test after development, before production, and from time to time to determine when there’s model drift.
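A periodic drift check can be as simple as comparing the distribution of a feature or model score in production against a baseline captured at training time. The sketch below uses the population stability index (PSI), one common choice among several. The bin edges, the sample values, and the 0.25 alert level (a frequently cited rule of thumb) are assumptions for illustration.

```python
import math

def psi(baseline, current, edges, eps=1e-6):
    """Population stability index between two samples over shared bin edges."""
    if not baseline or not current:
        raise ValueError("both samples must be non-empty")

    def shares(values):
        counts = [0] * (len(edges) + 1)
        for v in values:
            # Bin index is the number of edges at or below the value.
            counts[sum(1 for e in edges if v >= e)] += 1
        # Floor empty bins at eps so the logarithm stays defined.
        return [max(c / len(values), eps) for c in counts]

    b, c = shares(baseline), shares(current)
    return sum((ci - bi) * math.log(ci / bi) for bi, ci in zip(b, c))

edges = [0.25, 0.5, 0.75]
baseline = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]       # scores at training time
current = [0.6, 0.7, 0.8, 0.9, 0.8, 0.9, 0.7, 0.95]       # scores in production

ALERT_LEVEL = 0.25  # assumed threshold; tune per product
print(psi(baseline, baseline, edges) > ALERT_LEVEL)  # False, identical samples
print(psi(baseline, current, edges) > ALERT_LEVEL)   # True, scores shifted upward
```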

Corneli of Exasol adds, “Planning ahead about monitoring these models, dashboards, and APIs to ensure uptime and ongoing accuracy are key for long-term success.”

Top product managers recognize the importance of machine learning monitoring and dedicate engineering time before deployment to operationalize the product’s technology components. Organizations operating with site reliability engineering practices will define service level objectives around performance, model quality, data quality, and other errors that impact business operations.

Measure business value and KPIs

Even though businesses are investing heavily in data science, machine learning, and AI, data science teams must deliver business value and communicate impacts in key business metrics to justify ongoing investment. It’s the product manager’s responsibility to communicate where data science initiatives are making strategic and operational impacts.

“The success of an AI product, whether a platform or model, hinges on how well it addresses end-user needs,” says Ashoori of IBM.

Interviewing, surveying, and using other feedback mechanisms can help identify areas for improvement and measure customer satisfaction (CSat).

Flaxman of Heavy.ai shares these additional operational KPIs:

How frequently are models being updated and deployed?

How accurate are the models?

What is the average time to detect issues and anomalies?

How long does it take to resolve an issue once detected?
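The questions above map onto measures that can be computed from deployment and incident logs. The following sketch assumes a hypothetical incident record with started, detected, and resolved timestamps; real monitoring tools will store these differently.

```python
from datetime import datetime, timedelta

def mean_duration(pairs):
    """Average elapsed time across (start, end) datetime pairs."""
    durations = [end - start for start, end in pairs]
    return sum(durations, timedelta()) / len(durations)

def deploys_per_week(n_deploys, period_days):
    """Average number of model deployments per week over the period."""
    return n_deploys * 7 / period_days

# Hypothetical incident log; field names are illustrative only.
incidents = [
    {"started": datetime(2024, 8, 1, 9, 0),
     "detected": datetime(2024, 8, 1, 9, 30),
     "resolved": datetime(2024, 8, 1, 11, 30)},
    {"started": datetime(2024, 8, 5, 14, 0),
     "detected": datetime(2024, 8, 5, 15, 30),
     "resolved": datetime(2024, 8, 5, 16, 30)},
]

time_to_detect = mean_duration((i["started"], i["detected"]) for i in incidents)
time_to_resolve = mean_duration((i["detected"], i["resolved"]) for i in incidents)

print(time_to_detect)            # 1:00:00
print(time_to_resolve)           # 1:30:00
print(deploys_per_week(4, 28))   # 1.0
```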

Kjell Carlsson, head of data science strategy and evangelism at Domino, suggests that data science product managers must be flexible because AI/ML projects and the relevant KPIs vary dramatically. “As a rule of thumb, product managers should track and communicate the KPIs that correspond with how the business measures value, in addition to reporting on model, pipeline, and application performance measures.”

Mar Facio, product and project manager at Xebia, suggests using straightforward business metrics, saying, “Business impact is assessed through the reduction in operational costs and improved user engagement metrics.” She recommends selecting industry-standard metrics for model performance, including accuracy, precision, recall, F1 score, and AUC-ROC for classification tasks.
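For readers less familiar with these model performance metrics, the sketch below computes them for a binary classifier using only the Python standard library. The labels and scores are made-up sample data, and in practice most teams would use an established library rather than hand-rolled functions.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "f1": 2 * tp / (2 * tp + fp + fn) if tp + fp + fn else 0.0,
    }

def auc_roc(y_true, scores):
    """Probability that a random positive outranks a random negative (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Made-up sample data for illustration.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.7, 0.55]

metrics = classification_metrics(y_true, y_pred)
print(metrics["accuracy"], metrics["precision"], metrics["recall"])  # 0.625 0.6 0.75
print(auc_roc(y_true, scores))                                       # 0.875
```

Precision and recall pull in different directions, so which one to emphasize depends on whether false positives or false negatives cost the business more.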

Product managers shouldn’t expect metrics alone to communicate insights, nor should they expect stakeholders to be great data science storytellers. Top product managers seek to market their teams’ successes by learning how end-users use data science products to improve decision-making and productivity.

“As the bridge between the data science team and stakeholders, the product manager’s role includes translating technical findings into actionable business strategies,” says Nidhi Shah, product owner at Vericast. “This involves distilling complex data insights into clear and understandable terms for decision-makers, highlighting the importance of clear communication.”

Product managers are responsible for translating business objectives into executable roadmaps and demonstrating the business outcomes. While technology projects always have risks and unknowns, data science initiatives add unknowns around data quality, model performance, and end-user adoption. Product managers who successfully build and enhance data science products are likelier to advance to larger-scale initiatives and more emergent AI opportunities.