ABSTRACT
Quantifying gas solubility in oil under reservoir conditions is essential in reservoir engineering. Several laboratory techniques can measure it, but they are time-consuming and costly. Consequently, the solution gas-oil ratio is commonly estimated with mathematical correlations and artificial intelligence models. In this study, 757 data points were collected from published articles and partitioned into training and test sets. Four artificial intelligence models were then used to model the solution gas-oil ratio: Histogram Gradient Boosting Regression (HGBoost), Adaptive Boosting (AdaBoost), Support Vector Regression (SVR), and the Adaptive Neuro-Fuzzy Inference System (ANFIS). In addition, the Memetic algorithm (MA) was used to recompute the coefficients of the Petrosky and Farshad model, originally developed with 81 data points, using an expanded set of 605 training instances. Model accuracy and performance were assessed through statistical and visual analyses. The HGBoost model performs best, achieving an overall R² of 0.9886 and an overall RMSE of 41.29 scf/STB. Among the mathematical models, the modified Petrosky and Farshad model stands out with an overall R² of 0.9078 and an overall RMSE of 114.49 scf/STB. The Taylor diagram and the violin plot both indicate that the HGBoost model and the modified Petrosky and Farshad model outperform the other machine learning models and the other mathematical models, respectively. These results demonstrate the accuracy of the models described in this study.
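The two statistics quoted in the abstract, RMSE and R², can be illustrated with a minimal sketch. It assumes the common definition R² = 1 − SS_res/SS_tot; the study may use a different variant. The function names and the sample values below are hypothetical and are not taken from the study's data.

```python
import math


def rmse(observed, predicted):
    """Root mean-square error, in the same units as the inputs (e.g. scf/STB)."""
    n = len(observed)
    squared_errors = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared_errors / n)


def r_squared(observed, predicted):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    mean_observed = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_observed) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Made-up solution gas-oil ratios (scf/STB), for illustration only.
rs_observed = [200.0, 400.0, 600.0, 800.0]
rs_predicted = [210.0, 390.0, 620.0, 780.0]

print(f"RMSE = {rmse(rs_observed, rs_predicted):.2f} scf/STB")
print(f"R2   = {r_squared(rs_observed, rs_predicted):.4f}")
```

A lower RMSE and an R² closer to 1 both indicate closer agreement between predicted and measured values, which is the sense in which the models are ranked.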
Nomenclature
AdaBoost | = | Adaptive Boosting |
AI | = | Artificial Intelligence |
ANFIS | = | Adaptive Neuro-Fuzzy Inference System |
ANN | = | Artificial Neural Network |
API | = | API oil gravity, (°API) |
DT | = | Decision Tree |
FIS | = | Fuzzy Inference System |
GEP | = | Gene-Expression Programming |
HGBoost | = | Histogram Gradient Boosting Regression |
LSSVM | = | Least Square Support Vector Machine |
MA | = | Memetic algorithm |
MAD | = | Mean Absolute Deviation |
MF | = | Membership Function |
ML | = | Machine Learning |
MLP | = | Multi-Layer Perceptron |
N | = | Count of data records |
O | = | Output |
Pb | = | Bubble point pressure, (psi) |
PSO | = | Particle Swarm Optimization |
PVT | = | Pressure-volume-temperature |
R² | = | Coefficient of determination |
RBF | = | Radial Basis Function |
RMSE | = | Root mean-square error |
Rs | = | Solution gas-oil ratio, (scf/STB) |
SAA | = | Simulated Annealing Algorithm |
SG | = | Gas Specific Gravity |
SI | = | Scatter Index |
SVM | = | Support Vector Machine |
SVR | = | Support Vector Regression |
T | = | Temperature |
TSK | = | Takagi-Sugeno-Kang |
w | = | Weight |
γg | = | Gas Specific Gravity |
γo | = | Oil Specific Gravity |
Disclosure statement
No potential conflict of interest was reported by the author(s).
Data availability statement
Data will be made available upon request.
Author contributions
H.Y.: Conceptualization, Methodology, Investigation, Software, Data curation, Visualization, Writing-Original Draft, Writing-Review & Editing, Validation.
Additional information
Correspondence and requests for materials should be addressed to H.Y.
Acknowledgments
The author acknowledges the use of ChatGPT version 3.5 and Grammarly to improve the text quality of this article through paraphrasing. The original text was written based on scientific sources and subsequently rephrased with these tools.