Abstract
In mineral processing, controlling a paste thickener is a critical and highly challenging task because of the system's high complexity, incomplete observation space, and heavy environmental noise. In this article, we propose a control strategy driven by offline data that optimizes the operational indices of the thickening system using offline reinforcement learning (RL). Unlike common RL methods that rely on online interactive training, our approach trains the controller solely on offline datasets, which keeps the production process safe by avoiding dangerous online exploration. To collect the offline dataset, we use prior knowledge of the thickening mechanism to design a proportional-integral-derivative (PID) controller that serves as the behavior policy and records operational trajectories. To address a critical issue in controlling the thickening system, namely its constrained observation space, we analyze the dynamical properties of the system and introduce a novel offline RL algorithm, temporal batch-constrained Q-learning (TBCQ). The algorithm and its associated model framework are developed specifically for controlling partially observed Markov decision processes (POMDPs). TBCQ and the trained policy are evaluated in both a simulated thickening environment and a real industrial paste thickener at a copper mine. The real-world experiments demonstrate that the proposed controller outperforms the baselines and reduces the tracking error of the underflow concentration by over 12%. The successful application of our pipeline to a paste thickener also offers a new perspective on optimization problems in complex industrial systems: performing offline RL on a dataset sampled from a suboptimal policy.
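The data-collection step described in the abstract (a PID controller acting as the behavior policy and logging trajectories for later offline training) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the article: the first-order `toy_plant_step` stands in for the thickener, and the PID gains, the reward (negative absolute tracking error), and the names `PID` and `collect_offline_dataset` are hypothetical. It does not implement TBCQ.

```python
import random
from dataclasses import dataclass


@dataclass
class PID:
    """Textbook discrete PID controller (gains are illustrative assumptions)."""
    kp: float
    ki: float
    kd: float
    dt: float = 1.0
    integral: float = 0.0
    prev_error: float = 0.0

    def act(self, setpoint: float, measurement: float) -> float:
        error = setpoint - measurement
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def toy_plant_step(y, u, rng, a=0.9, b=0.1, noise_std=0.01):
    """Hypothetical noisy first-order plant; NOT a thickener model."""
    return a * y + b * u + rng.gauss(0.0, noise_std)


def collect_offline_dataset(steps=200, setpoint=1.0, seed=0):
    """Run the PID behavior policy and log (obs, action, reward, next_obs)."""
    rng = random.Random(seed)
    pid = PID(kp=2.0, ki=0.5, kd=0.1)
    y = 0.0
    dataset = []
    for _ in range(steps):
        u = pid.act(setpoint, y)
        y_next = toy_plant_step(y, u, rng)
        reward = -abs(setpoint - y_next)  # assumed reward: negative tracking error
        dataset.append((y, u, reward, y_next))
        y = y_next
    return dataset


if __name__ == "__main__":
    data = collect_offline_dataset()
    print(len(data), "transitions collected")
```

An offline RL learner would then be trained only on the logged transitions, with no further interaction with the plant.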
| Original language | English |
|---|---|
| Pages (from-to) | 49-59 |
| Number of pages | 11 |
| Journal | IEEE Transactions on Industrial Informatics |
| Volume | 21 |
| Issue number | 1 |
| DOIs | |
| Publication status | Published - Jan 2025 |
Keywords
- Intelligent control
- offline reinforcement learning (RL)
- partially observed Markov decision processes (POMDP)
- paste thickener
ASJC Scopus subject areas
- Control and Systems Engineering
- Information Systems
- Computer Science Applications
- Electrical and Electronic Engineering