<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Portfolio | Koehn AI</title><link>https://www.koehn.ai/en/portfolio/</link><atom:link href="https://www.koehn.ai/en/portfolio/index.xml" rel="self" type="application/rss+xml"/><description>Portfolio</description><generator>Hugo Blox Builder (https://hugoblox.com)</generator><language>en-US</language><lastBuildDate>Wed, 04 Jun 2025 00:00:00 +0000</lastBuildDate><image><url>https://www.koehn.ai/media/logo.svg</url><title>Portfolio</title><link>https://www.koehn.ai/en/portfolio/</link></image><item><title>Acoustic Monitoring for Predictive Maintenance of Turbines</title><link>https://www.koehn.ai/en/portfolio/acoustic_monitoring/</link><pubDate>Wed, 04 Jun 2025 00:00:00 +0000</pubDate><guid>https://www.koehn.ai/en/portfolio/acoustic_monitoring/</guid><description>&lt;p&gt;&lt;strong&gt;Acoustic Monitoring and Predictive Maintenance in Turbines&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Predictive maintenance leverages advanced data analysis and machine learning to monitor equipment health, enabling early detection of emerging faults before they lead to costly downtime or catastrophic failures. By transforming raw sensor data into actionable insights, organizations can optimize maintenance schedules, reduce unplanned outages, and extend the life of critical assets. This proactive approach lowers operational costs while enhancing safety and reliability in demanding industrial environments.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Client Profile and Context&lt;/strong&gt;&lt;br&gt;
The client is a leading turbine manufacturer with turbines in service. They commissioned a research project to advance predictive maintenance using microphone-based acoustic monitoring. The goal was to explore unsupervised generative methods that learn reference audio spectra conditioned on operating state, drawing on turbine acoustics and operational parameters. Real acoustic emissions can then be compared with the modeled spectra, with particular emphasis on out-of-domain performance.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Challenge&lt;/strong&gt;&lt;br&gt;
The research team needed to develop generative models that represent normal turbine acoustics without relying on labeled fault data. Acoustic information originated from externally fitted microphones and was available as spectrograms with predefined frequency resolution rather than raw audio, paired with internal machine and ambient parameters. Selecting appropriate conditioning variables, such as inlet guide vane position, load and compressor inlet temperature, was essential to keep generated spectra physically plausible across operating regimes. High noise levels and the absence of labeled anomalies made learning and evaluation difficult. The team therefore pursued a robust unsupervised approach aimed at producing realistic reference spectrograms with out-of-domain generalization under varied turbine conditions, establishing the basis for future anomaly detection.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Our Approach and Solution&lt;/strong&gt;&lt;br&gt;
We began by conducting extensive exploratory data analysis (EDA) on spectrograms derived from externally fitted microphones. Dimensionality reduction, using Principal Component Analysis (PCA), indicated that inlet guide vane position and compressor inlet temperature were the primary drivers of separable operating domains. Data preparation involved selecting frequency bands with optimal signal-to-noise ratios and restricting records to normal operating conditions, with out-of-domain regimes held out for evaluation. Exponential mapping and logarithmic inversion prevented non-physical spectrogram values. Visualizations in Python with Plotly guided our feature-engineering decisions.&lt;/p&gt;
&lt;p&gt;Building on these insights, we developed a Conditional Variational Autoencoder (CVAE) in PyTorch. The CVAE’s convolutional neural network (CNN)–based encoder–decoder architecture was conditioned on key parameters to balance spectrogram reconstruction accuracy and latent-space regularization. Comprehensive grid searches optimized encoder and decoder layer depths, latent-space dimensions, and regularization strength. Iterative training rounds included hypothesis-driven evaluations of in-domain and out-of-domain spectrogram reconstructions. Throughout the project, our team collaborated closely with client experts, reviewing latent-space visualizations and refining conditioning strategies. The final deliverables comprised a trained CVAE for spectrogram generation, a detailed scientific report outlining methodology and findings, and a roadmap for extending the research to real audio anomaly detection (once raw audio is recorded) using pre-trained audio transformers.&lt;/p&gt;
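&lt;p&gt;The trade-off the CVAE balances can be made concrete through its loss, the negative evidence lower bound (ELBO): a reconstruction term plus a weighted KL divergence that regularizes the latent space. The numpy sketch below is illustrative only; the beta weight and all names are hypothetical, and the conditioning inputs (which enter through the encoder and decoder, not the loss) are omitted:&lt;/p&gt;

```python
import numpy as np

def cvae_loss(x, x_recon, mu, log_var, beta=1.0):
    """Negative ELBO: squared-error reconstruction plus beta-weighted KL
    divergence of the Gaussian posterior N(mu, exp(log_var)) from the
    standard-normal prior."""
    recon = np.sum((x - x_recon) ** 2)
    kl = -0.5 * np.sum(1.0 + log_var - mu ** 2 - np.exp(log_var))
    return recon + beta * kl

# A perfect reconstruction whose posterior equals the prior incurs zero
# loss; pushing the posterior mean away from zero is penalized by the KL.
x = np.ones(8)
zero_loss = cvae_loss(x, x.copy(), np.zeros(4), np.zeros(4))
kl_only = cvae_loss(x, x.copy(), np.ones(4), np.zeros(4))
```

Raising beta trades reconstruction fidelity for a smoother, better-regularized latent space, which is the axis explored in the grid searches described above.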
&lt;p&gt;&lt;strong&gt;We provided&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Generative CVAE models trained to reconstruct and generate acoustic spectrograms from microphones under varied turbine conditions.&lt;/li&gt;
&lt;li&gt;Comprehensive exploratory analysis and data visualization, enabling clear identification of operational regimes.&lt;/li&gt;
&lt;li&gt;Identification and selection of critical machine parameters to ensure robust model conditioning.&lt;/li&gt;
&lt;li&gt;A detailed scientific report summarizing methodologies, findings, and actionable recommendations for future anomaly detection.&lt;/li&gt;
&lt;li&gt;A strategic roadmap for integrating real audio data once recorded and leveraging state-of-the-art anomalous sound detection methods.&lt;/li&gt;
&lt;/ul&gt;</description></item><item><title>Explainable AI for Neural Networks in Drug Development</title><link>https://www.koehn.ai/en/portfolio/explainable_ai/</link><pubDate>Mon, 26 May 2025 00:00:00 +0000</pubDate><guid>https://www.koehn.ai/en/portfolio/explainable_ai/</guid><description>&lt;p&gt;In recent decades, artificial intelligence (AI) and machine learning (ML) have emerged as powerful tools to accelerate drug development, particularly through methods like Quantitative Structure-Activity Relationship (QSAR) modelling, which predict the biological activity of small molecules. However, the inference results of ML models, especially neural networks, are often difficult to interpret for scientists and are frequently referred to as “black boxes”.&lt;/p&gt;
&lt;p&gt;To support rational drug design and comply with increasing regulatory demands for model transparency, explainable AI (XAI) methods are now being adopted to provide insights into how ML models make their predictions. In 2024, Koehn AI was approached by a German pharmaceutical company to help integrate the latest advances in XAI techniques into their PyTorch-based QSAR neural network framework.&lt;/p&gt;
&lt;p&gt;Our work focused on evaluating neural-network-specific explainability tools such as Integrated Gradients via the Captum library, alongside more generalizable approaches like Local Interpretable Model-Agnostic Explanations (LIME) and counterfactual generation using the STONED algorithm in tandem with Self-Referencing Embedded Strings (SELFIES) molecular representations. These methods were benchmarked on real-world activity cliff datasets and evaluated across different model types, including tabular, graph-based and chemical language architectures.&lt;/p&gt;
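&lt;p&gt;For intuition, Integrated Gradients attributes a prediction to input features by accumulating gradients along a straight-line path from a baseline to the input. The self-contained numpy sketch below uses a toy quadratic model rather than the client’s networks (which used Captum’s implementation), and checks the method’s completeness property:&lt;/p&gt;

```python
import numpy as np

def integrated_gradients(grad_f, x, baseline, steps=100):
    """Midpoint-rule approximation of Integrated Gradients:
    (x - baseline) times the path integral of the gradient
    from the baseline to x."""
    alphas = (np.arange(steps) + 0.5) / steps
    total = np.zeros_like(x, dtype=float)
    for a in alphas:
        total += grad_f(baseline + a * (x - baseline))
    return (x - baseline) * total / steps

# Toy model f(x) = sum(x**2) with gradient 2*x.
f = lambda v: np.sum(v ** 2)
x = np.array([1.0, 2.0, 3.0])
baseline = np.zeros(3)
attr = integrated_gradients(lambda v: 2.0 * v, x, baseline)

# Completeness axiom: attributions sum to f(x) - f(baseline).
gap = abs(attr.sum() - (f(x) - f(baseline)))
```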
&lt;p&gt;We implemented a robust pipeline that allowed researchers to map predicted molecular activity contributions to individual atoms and molecular fingerprint bits, visualized through RDKit Similarity Maps, offering intuitive, chemically meaningful interpretations. The solution was optimized for performance to ensure seamless and effective integration with the client’s existing workflows and was deployed with their production QSAR platform to support decision making in rational drug design.&lt;/p&gt;
&lt;h2 id="we-delivered"&gt;We delivered:&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;Development of an Explainable AI module integrated into the client’s QSAR neural network framework&lt;/li&gt;
&lt;li&gt;Evaluation of XAI methods including Captum, LIME, and counterfactuals with SELFIES and STONED algorithms&lt;/li&gt;
&lt;li&gt;Support for multiple neural network architectures, with visualization via RDKit Similarity Maps&lt;/li&gt;
&lt;li&gt;Benchmarking with activity cliff datasets to assess relevance and reliability of explanations&lt;/li&gt;
&lt;li&gt;Optimization of explainability pipelines for efficient end-user interaction within existing workflows&lt;/li&gt;
&lt;li&gt;Deployment of the final module into the client’s production environment supporting informed molecular design&lt;/li&gt;
&lt;/ul&gt;</description></item><item><title>Predictive maintenance and anomaly detection in green energy production</title><link>https://www.koehn.ai/en/portfolio/predictive_maintenance/</link><pubDate>Wed, 09 Apr 2025 00:00:00 +0000</pubDate><guid>https://www.koehn.ai/en/portfolio/predictive_maintenance/</guid><description>&lt;p&gt;To minimize the downtime cost of critical production devices in the field, a major manufacturer hired Koehn AI to develop a predictive maintenance system.&lt;/p&gt;
&lt;p&gt;When it comes to energy production, any downtime, however brief, comes at a cost.
It is therefore essential to detect potential failure points before they occur, so that technicians can address faults preemptively and minimize costs, all in real time.&lt;/p&gt;
&lt;p&gt;Koehn AI was hired in 2024 by a global manufacturer of solar technology devices to develop an end-to-end predictive maintenance solution that allows users to monitor their devices for errors and anomalies.
We set out to build a comprehensive end-to-end product: from configuring the devices for data output, to setting up data pipelines and data governance infrastructures, to AI model building and output visualization.
In the process, we leveraged state-of-the-art data analytics and warehousing services: Microsoft Fabric and Databricks instances were set up in Azure with a code-first, infrastructure-as-code approach, upon which ETL pipelines based on Spark Structured Streaming were deployed.
Once access to clean and actionable streaming data had been ensured, Koehn AI’s extensive scientific and technical knowledge base came into play in the development of custom, use-case-specific algorithms designed with the internal workings of the devices in mind.
The algorithms, which now run on a schedule over the database, write their results to an intuitive visualization dashboard.&lt;/p&gt;
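&lt;p&gt;The production algorithms are device-specific and proprietary, but the kind of streaming-friendly statistic they can build on is easy to sketch; the rolling z-score below (window, threshold and data are illustrative) flags values that deviate sharply from recent history:&lt;/p&gt;

```python
import numpy as np

def rolling_zscore_flags(values, window=24, threshold=4.0):
    """Flag points whose deviation from the trailing-window mean exceeds
    `threshold` standard deviations of that window."""
    flags = []
    for i, v in enumerate(values):
        past = values[max(0, i - window):i]
        if len(past) > 2:
            mu = np.mean(past)
            sigma = np.std(past) + 1e-9   # avoid division by zero
            flags.append(abs(v - mu) / sigma > threshold)
        else:
            flags.append(False)           # not enough history yet
    return flags

# A steady sensor reading with one injected fault-like spike.
signal = np.array([10.0] * 30)
signal[20] = 100.0
flags = rolling_zscore_flags(signal)
```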
&lt;h2 id="what-we-have-provided"&gt;What we have provided&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;A scalable, real-time data pipeline that can handle the growing demands of the client&amp;rsquo;s expanding device fleet.&lt;/li&gt;
&lt;li&gt;A comprehensive, marketable anomaly detection solution as a value-added service to the client&amp;rsquo;s customers.&lt;/li&gt;
&lt;li&gt;A user-friendly dashboard that enables real-time monitoring and alerts, made available to the client&amp;rsquo;s internal developers.&lt;/li&gt;
&lt;/ul&gt;</description></item><item><title>Statistical Forecast Optimization</title><link>https://www.koehn.ai/en/portfolio/supply_chain/</link><pubDate>Wed, 02 Apr 2025 00:00:00 +0000</pubDate><guid>https://www.koehn.ai/en/portfolio/supply_chain/</guid><description>&lt;h2 id="problem-statement"&gt;Problem Statement&lt;/h2&gt;
&lt;p&gt;The client leveraged SAP IBP Demand Planning for product demand forecasting. However, major market disruptions—including natural gas shortages, price volatility, surging demand for heat pumps, and dynamic regulatory changes in Germany—introduced substantial instability and severely impacted forecast accuracy. Most product time series showed clear signs of disruption, while the existing forecast setup lacked the robustness to respond effectively. With little preprocessing to smooth out anomalies, the models reacted sharply to erratic short-term spikes, resulting in dramatic forecast shifts.&lt;/p&gt;
&lt;p&gt;In addition, accessory products—typically sold alongside core products—were forecasted independently, ignoring joint sales dynamics.&lt;/p&gt;
&lt;h2 id="solution-provided"&gt;Solution Provided&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;Established essential data preprocessing routines—including outlier detection and correction—to reduce noise and stabilize input signals prior to modeling.&lt;/li&gt;
&lt;li&gt;Replaced the existing Best Fit approach with an ensemble forecasting strategy in SAP IBP, combining multiple models to deliver more stable, accurate predictions across all product groups.&lt;/li&gt;
&lt;li&gt;Developed a custom Python-based software solution to leverage correlations between core and accessory products, integrating product lifecycle data for more intelligent accessory forecasting.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="implementation--execution"&gt;Implementation &amp;amp; Execution&lt;/h2&gt;
&lt;p&gt;At project start, the client relied on SAP IBP’s Best Fit algorithm, which selected a single top-performing model (e.g., ARIMA, Exponential Smoothing, Gradient Boosting) based on historical data. We replaced this with a forward-looking ensemble forecasting strategy, combining multiple models into a weighted aggregate to produce more resilient and accurate forecasts. Optimal model weights were derived through time series cross-validation, simulating real-world forecast performance and enabling robust selection across a variety of demand patterns. These weights were periodically recalibrated for core products, accessories, and product segments with pronounced seasonality.&lt;/p&gt;
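&lt;p&gt;One simple way to turn cross-validation results into ensemble weights is inverse-error weighting, sketched below; this is a simplified stand-in for the project’s calibration procedure, and the error values and forecasts are hypothetical:&lt;/p&gt;

```python
import numpy as np

def inverse_error_weights(cv_errors):
    """Turn per-model cross-validation errors (e.g. mean absolute error
    across folds) into normalized ensemble weights: lower error, higher
    weight."""
    inv = 1.0 / (np.asarray(cv_errors, dtype=float) + 1e-9)
    return inv / inv.sum()

def ensemble_forecast(model_forecasts, weights):
    """Weighted aggregate of the individual model forecasts."""
    return np.average(np.asarray(model_forecasts), axis=0, weights=weights)

# Hypothetical CV errors for three candidate models (e.g. ARIMA,
# exponential smoothing, gradient boosting) and their one-step forecasts.
weights = inverse_error_weights([2.0, 1.0, 4.0])
forecast = ensemble_forecast([[100.0], [110.0], [90.0]], weights)
```

Unlike a Best Fit selection, the weighted aggregate never commits to a single model, which is what makes it more resilient when demand patterns shift.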
&lt;p&gt;One key improvement involved implementing a robust preprocessing layer. Before feeding time series into forecasting models, we applied targeted outlier correction and sales data smoothing techniques. This reduced the influence of extreme, short-term events on the forecast, leading to more consistent performance and mitigating overreactions to volatile months.&lt;/p&gt;
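&lt;p&gt;A typical robust outlier correction of this kind clips values that sit far from a median-based estimate of spread; the sketch below (thresholds and data are illustrative, not the client’s) winsorizes a demand series containing one extreme spike:&lt;/p&gt;

```python
import numpy as np

def clip_outliers(series, k=3.0):
    """Replace points farther than k robust standard deviations from the
    series median with the clipped boundary value (winsorization)."""
    x = np.asarray(series, dtype=float)
    med = np.median(x)
    mad = np.median(np.abs(x - med)) * 1.4826 + 1e-9   # robust sigma
    lo, hi = med - k * mad, med + k * mad
    return np.clip(x, lo, hi)

# A demand history with one extreme, short-term spike.
history = [100, 105, 98, 102, 400, 101, 99, 103]
cleaned = clip_outliers(history)
```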
&lt;p&gt;To further enhance forecasting capabilities, we developed a custom Python-based software solution: Core Product Governed Accessory Forecasting (CPGAF). This tool capitalized on the strong correlations between core and accessory products, integrating historical relationship patterns and PLM information to deliver accessory forecasts that are both intelligent and synchronized with the broader product ecosystem.&lt;/p&gt;
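&lt;p&gt;The core idea behind coupling accessory forecasts to core products can be sketched as attach-rate forecasting: accessory demand is derived from the core forecast via the historical sales ratio, optionally damped by a lifecycle factor. All names and numbers below are hypothetical; CPGAF itself is considerably more elaborate:&lt;/p&gt;

```python
import numpy as np

def accessory_forecast(core_forecast, core_history, accessory_history,
                       lifecycle_factor=1.0):
    """Forecast accessory demand as attach_rate * core forecast, where the
    attach rate is the historical ratio of accessory to core sales. The
    lifecycle_factor can damp demand for phased-out accessories."""
    attach_rate = np.sum(accessory_history) / np.sum(core_history)
    return np.asarray(core_forecast, dtype=float) * attach_rate * lifecycle_factor

# Roughly one accessory sold per four core units in the history.
core_hist = [100, 120, 110]
acc_hist = [25, 30, 27.5]
acc_fc = accessory_forecast([200.0, 240.0], core_hist, acc_hist)
```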
&lt;p&gt;Beyond model improvements, we provided the client with deep-dive data analytics to uncover key performance drivers—spanning product groups, countries, and forecast granularities—by examining different product hierarchy levels and time intervals.&lt;/p&gt;
&lt;h2 id="results--impact"&gt;Results &amp;amp; Impact&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;Achieved a significant improvement in forecast stability and accuracy, boosting key performance indicators by 15% to 30% across target markets.&lt;/li&gt;
&lt;li&gt;Successfully implemented the Python-based CPGAF package, enabling more accurate accessory forecasting driven by core product predictions.&lt;/li&gt;
&lt;li&gt;Strengthened forecasting resilience in a volatile environment, positioning demand planning as a strategic, data-informed driver of business value.&lt;/li&gt;
&lt;/ul&gt;</description></item></channel></rss>