Computer Science

Creating optimized machine learning pipelines for PV power generation forecasting using the grid search and tree-based pipeline optimization tool

Article: 2323818 | Received 20 Apr 2023, Accepted 22 Feb 2024, Published online: 15 Mar 2024

Figures & data

Figure 1. Machine learning pipeline structure.


Figure 2. An example schedule of the appliances’ working hours on the farm, based on the power generation forecasts given by the optimal ML pipeline.

Figure 3. Main components of the farm’s solar power system.


Figure 4. Training process visualization.


Figure 5. Dataset splitting strategy for the training and testing processes.


Figure 6. The experimental setup.


Table 1. Sample of raw data exported from the SPS controller.

Figure 7. Correlation matrix: Pearson correlation between features.

Figure 8. Visualization of the AdaBoost working procedure on a sample of data points (Raschka et al. 2022a).

Table 2. Regressors included in the Grid Search, their tuned hyperparameters, and the number of trained regressors and total pipelines.

Figure 9. Machine learning pipeline steps automated by TPOT.


Figure 10. Distribution of NMAE values, grouped by hour.

Figure 11. Distribution of ANMAE values for all tested pipelines.

Figure 12. Distribution of PRMSE values for all tested pipelines.

Table 3. Details of the top three and bottom three pipelines resulting from the two optimization methods: Grid Search and TPOT.

Figure 13. Lowest ANMAE achieved by each regressor type.


Figure 14. ANMAE achieved by the best pipeline resulting from each of the two optimization methods.

Table 4. Details of the best pipeline resulting from each of the two optimization methods: Grid Search and TPOT.

Supplemental material

Cogent_revised.zip (2.5 MB)

revisedCOGENT.bbl (6.9 KB)

interact.cls (23.8 KB)

Data availability statement

Data are available on request from the authors.