In ML, a model cannot yet be expected to learn the patterns in data and predict unseen data without any help. We should therefore present the data in a form that makes the task easier for the model. A good selection of features reduces computational time, improves accuracy, and allows the model to solve the task with a small data set. Feature selection therefore deserves particular attention in computational chemistry, where data are limited. ML models trained on a data set can use geometric structure as input to explore high-activity catalysts. The original geometric coordinates are not used directly in building these models because, without data processing, they make explicit neither the translational, rotational, and permutational symmetries nor the similarities and differences between structures.47 Here we therefore use the Coulomb matrix48 as a descriptor for representing atomic geometric structures, as shown in Eq. (4.3):
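$$
M_{ij} =
\begin{cases}
0.5\, Z_i^{2.4}, & i = j \\
\dfrac{Z_i Z_j}{\lvert \mathbf{R}_i - \mathbf{R}_j \rvert}, & i \neq j
\end{cases}
\tag{4.3}
$$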
where $Z_i$ is the atomic number of atom i and $\mathbf{R}_i$ is its position. Because the Coulomb matrix is symmetric, only half of its off-diagonal elements are independent, so we considered only the upper triangular part (the total-sorted elements). Even so, the dimension of the Coulomb matrix is high, and it is useful to reduce it, a step known as dimensionality reduction. Model complexity increases exponentially with dimensionality, and the increased complexity requires more data for training.49
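The descriptor can be sketched in a few lines of Python. This is a minimal illustration, not the code used in this work. It assumes the standard form of the Coulomb matrix (diagonal 0.5 Z_i^2.4, off-diagonal Z_i Z_j / |R_i - R_j|). The function names and the two-atom toy input are hypothetical, and no sorting of rows or columns is applied.

```python
import math


def coulomb_matrix(atomic_numbers, positions):
    """Return the Coulomb matrix as a list of rows (assumed standard form)."""
    n = len(atomic_numbers)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                # Diagonal term depends only on the atomic number.
                matrix[i][j] = 0.5 * atomic_numbers[i] ** 2.4
            else:
                # Off-diagonal term: nuclear charges over interatomic distance.
                distance = math.dist(positions[i], positions[j])
                matrix[i][j] = atomic_numbers[i] * atomic_numbers[j] / distance
    return matrix


def upper_triangle(matrix):
    """Flatten the upper triangle (diagonal included), row by row."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i, n)]


# Toy input (hypothetical): two atoms with Z = 1, one length unit apart.
Z = [1, 1]
R = [(0.0, 0.0, 0.0), (0.0, 0.0, 1.0)]
M = coulomb_matrix(Z, R)
features = upper_triangle(M)
```

Because the matrix is symmetric, the flattened upper triangle has n(n+1)/2 entries instead of n^2, which is the vector that would then be passed to dimensionality reduction.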
Principal component analysis (PCA) is widely employed for dimensionality reduction. PCA searches for the directions of maximum variance in the high-dimensional space (those carrying the most information about the data) and projects the data onto fewer dimensions that are statistically uncorrelated.50 The new dimensions, called principal components, are orthogonal axes. Here, we use PCA to reduce the Coulomb matrix to one axis (PCA1) for the DNN and to seven axes (PCA1 to PCA7) for LGBM, with 47% and 100% explained variance, respectively.
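To make the notion of explained variance concrete, the following sketch extracts the first principal component by power iteration on the sample covariance matrix. It is only an illustration under stated assumptions: the function names and the two-dimensional toy data are hypothetical, and a production workflow would more likely use a library PCA routine than this hand-written version.

```python
def first_principal_component(samples, iterations=200):
    """Return (direction, explained variance ratio) of the first component."""
    n, d = len(samples), len(samples[0])
    means = [sum(row[k] for row in samples) / n for k in range(d)]
    centered = [[row[k] - means[k] for k in range(d)] for row in samples]
    # Sample covariance matrix (d x d).
    cov = [[sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    # Power iteration converges to the eigenvector of the largest eigenvalue,
    # provided the start vector is not orthogonal to it.
    vec = [1.0] * d
    for _ in range(iterations):
        nxt = [sum(cov[a][b] * vec[b] for b in range(d)) for a in range(d)]
        norm = sum(v * v for v in nxt) ** 0.5
        vec = [v / norm for v in nxt]
    # Rayleigh quotient gives the variance captured along that direction.
    eigenvalue = sum(vec[a] * sum(cov[a][b] * vec[b] for b in range(d))
                     for a in range(d))
    total_variance = sum(cov[k][k] for k in range(d))
    return vec, eigenvalue / total_variance


# Toy descriptors (hypothetical): almost all variance lies along the first axis.
descriptors = [[2.0, 0.1], [4.0, -0.1], [6.0, 0.1], [8.0, -0.1]]
direction, ratio = first_principal_component(descriptors)
```

The returned ratio is the fraction of the total variance carried by the first axis, which is the quantity quoted in the text as 47% for PCA1.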
Figure 4.1. (a) Structures of B-doped graphene with metal single atoms (26 transition metals) considered for screening. (b,c) Side-on (b) and End-on (c) structures for N2 reduction. Color code: metal in purple, boron in green, carbon in brown and nitrogen in light blue.
Figure 4.2. Artificial neural network architecture used in this study (10 neurons in each hidden layer). The input is the optimized geometry of one of the SACs with an adsorbed intermediate (*N2, *N2H, *NH2, or *NH3). Each geometry is then characterized by seven features: PCA1, χ, Z, R, NC, fB, and NM.
By investigating combinations of many features, we found that in the DNN the catalytic activity is mainly governed by seven features: PCA1, the electronegativity of the TM (χ), the atomic number of the TM (Z), the atomic radius of the TM (R), the coordination number of the TM (NC), the fraction of B atoms among the coordinating atoms in TM-BxCy-Gr [fB = x/(x+y)], and the number of nitrogen atoms adsorbed on the TM (NM = 2 for the side-on and 1 for the end-on mechanism, Figure 4.1b-c). For the LGBM regression model, the seven features PCA1 to PCA7 are combined with eight further features, χ, Z, R, NC, fB, NM, the total number of nitrogen atoms (NN), and the number of hydrogen atoms (NH), to predict adsorption energies and free energies.
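As an illustration of how the seven DNN features could be assembled into one input vector, consider the sketch below. The class, the helper names, and every numerical value in the example are hypothetical placeholders; only the definitions fB = x/(x+y) and NM = 2 (side-on) or 1 (end-on) are taken from the text.

```python
from dataclasses import dataclass


def boron_fraction(x, y):
    """fB = x / (x + y) for a TM-BxCy-Gr site."""
    return x / (x + y)


def nitrogen_on_metal(mode):
    """NM: 2 for side-on adsorption, 1 for end-on adsorption."""
    counts = {"side-on": 2, "end-on": 1}
    if mode not in counts:
        raise ValueError(f"unknown adsorption mode: {mode}")
    return counts[mode]


@dataclass
class SiteFeatures:
    pca1: float
    electronegativity: float   # chi
    atomic_number: int         # Z
    atomic_radius: float       # R
    coordination_number: int   # NC
    boron_fraction: float      # fB
    n_on_metal: int            # NM

    def as_vector(self):
        """Return the features in the order PCA1, chi, Z, R, NC, fB, NM."""
        return [self.pca1, self.electronegativity, self.atomic_number,
                self.atomic_radius, self.coordination_number,
                self.boron_fraction, self.n_on_metal]


# Placeholder values for a B3C1 site with end-on N2 (not data from this work).
site = SiteFeatures(pca1=0.0, electronegativity=1.66, atomic_number=24,
                    atomic_radius=1.4, coordination_number=4,
                    boron_fraction=boron_fraction(3, 1),
                    n_on_metal=nitrogen_on_metal("end-on"))
vector = site.as_vector()
```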
4.4.2 Classification by Deep Neural Network (DNN)
NRR is a highly complicated process that can proceed through different pathways, such as the distal, alternating, and consecutive enzymatic mechanisms.51 Each pathway includes several intermediate states, so screening them requires enormous computing resources. Investigating every step is therefore inefficient, and simple descriptors are highly desirable.
On the basis of previous studies, we employ a three-key-step method to select active electrocatalysts for NRR.51-53 For an efficient catalyst, the adsorption energy of *N2 on the active site should be more negative than -0.5 eV. N2 is an extremely stable molecule because of its inert triple bond, so the hydrogenation of *N2 to *N2H requires a large energy. In addition, *NH2 may be stabilized on the surface because N has one half-filled sp3 hybrid orbital, so the free energy of its hydrogenation ($\Delta G_{\mathrm{NH_2 \to NH_3}}$) is expected to be uphill. We therefore adopted the three key steps proposed by Ling et al.:51 $E_{\mathrm{N_2}} < -0.50$ eV, $\Delta G_{\mathrm{N_2 \to N_2H}} < 0.55$ eV, and $\Delta G_{\mathrm{NH_2 \to NH_3}} < 0.70$ eV. Here, $E_{\mathrm{N_2}}$ is the adsorption energy of N2, $\Delta G_{\mathrm{N_2 \to N_2H}}$ is the free energy of hydrogenation of *N2 to *N2H, and $\Delta G_{\mathrm{NH_2 \to NH_3}}$ is the free energy of hydrogenation of *NH2 to *NH3. We trained the DNN in terms of these three quantities. If a catalyst meets all three requirements, the sample is eligible as a good NRR catalyst; otherwise, it is not qualified. Because the pattern in the data is very complicated, we rely on a neural network to classify the samples into eligible and non-eligible classes. The trained ANN can predict eligible catalysts from a single optimized geometry for the three steps above, as shown in Figure 4.2. In this way, many unnecessary DFT calculations are skipped, and the catalysts with a high probability are prioritized for experimental investigation. To select eligible catalysts, we consider seven features: PCA1, χ, Z, R, NC, fB, and NM. We divided the data (~500 samples) into two parts: training (85%) and testing (15%). In addition, to monitor overfitting without leaking information from the test set, the training samples were further split into training (75%) and validation (25%) parts. As mentioned earlier, the output of the DNN is the probability that a catalyst is efficient.
Samples with a predicted probability above 0.5 are considered eligible candidates for NRR. The model predicts eligible candidates with good accuracy (~85%).
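The labeling rule, the probability threshold, and the data split described in this section can be summarized in a short sketch. It is an illustration only. The function names are hypothetical, the strict inequalities are an assumption about how the three criteria are applied, and the sample count of 500 stands in for the approximate data set size.

```python
# Thresholds of the three key steps (eV), as stated in the text.
E_N2_MAX = -0.50           # adsorption energy of *N2
DG_N2_TO_N2H_MAX = 0.55    # *N2 -> *N2H
DG_NH2_TO_NH3_MAX = 0.70   # *NH2 -> *NH3


def is_eligible(e_n2, dg_n2_to_n2h, dg_nh2_to_nh3):
    """Label a sample as eligible only if all three criteria are met."""
    return (e_n2 < E_N2_MAX
            and dg_n2_to_n2h < DG_N2_TO_N2H_MAX
            and dg_nh2_to_nh3 < DG_NH2_TO_NH3_MAX)


def predicted_class(probability, threshold=0.5):
    """Turn the network output probability into an eligible/non-eligible call."""
    return probability > threshold


def split_sizes(n_samples, test_fraction=0.15, validation_fraction=0.25):
    """Return (train, validation, test) counts for the two-stage split."""
    n_test = round(n_samples * test_fraction)
    n_train_total = n_samples - n_test
    n_validation = round(n_train_total * validation_fraction)
    return n_train_total - n_validation, n_validation, n_test


# Assumed data set size of 500 samples.
train, validation, test = split_sizes(500)
```

With these fractions, the validation set is 25% of the 85% training portion, so roughly 64% of all samples are used to fit the weights.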
4.4.3 Regression by Light Gradient Boosting Machine (LightGBM)
The adsorption energies and free energies of some intermediate steps involved in NRR are predicted by ML, which greatly reduces the number of DFT calculations. For this purpose, we fitted a model to DFT data using fifteen engineered features: seven features, PCA1 to PCA7 (the Coulomb-matrix features reduced to seven by PCA with 100% explained variance), and eight features, χ, Z, R, NC, fB, NM, NN, and NH. We trained several regression models on the data and obtained the highest accuracy with the Light Gradient Boosting Machine (LGBM) model (Figure 4.3, RMSE = 0.11 eV). Once the LGBM model is trained, the relaxed geometry with adsorbed N2 ($E_{\mathrm{slab+ads}}$) can be used to predict the adsorption energy of N2 ($E_{\mathrm{ads}} = E_{\mathrm{slab+ads}} - E_{\mathrm{slab}} - E_{\mathrm{N_2}}$). Likewise, the relaxed geometries of *N-NH, *NH2, and *NH3 can be used as model inputs to predict $\Delta G_{\mathrm{N_2 \to N_2H}}$, $\Delta G_{\mathrm{NH_2 \to NH_3}}$, and $\Delta G_{\mathrm{NH_3\,desorbed}}$, respectively.
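The two quantities used in this section, the adsorption energy and the RMSE between predicted and DFT values, can be written out as a small sketch. The function names and the energies in the example are hypothetical placeholders, not values from this work.

```python
import math


def adsorption_energy(e_slab_ads, e_slab, e_n2):
    """E_ads = E_slab+ads - E_slab - E_N2 (all energies in eV)."""
    return e_slab_ads - e_slab - e_n2


def rmse(predicted, reference):
    """Root-mean-square error between model predictions and DFT references."""
    squared = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(squared))


# Placeholder total energies (eV), chosen only to exercise the formula.
e_ads = adsorption_energy(e_slab_ads=-101.2, e_slab=-84.0, e_n2=-16.6)

# Placeholder predictions versus references (eV).
error = rmse([-0.55, 0.40, 0.62], [-0.60, 0.45, 0.70])
```

A negative adsorption energy means that binding N2 lowers the total energy, which is the sense in which the first screening criterion requires a value below -0.5 eV.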
Figure 4.3. (a) Artificial neural network (ANN) predictions for the test samples: predicted probability versus an arbitrary horizontal axis (X is used only to keep data points from coinciding). (b) Prediction performance plot comparing DFT calculations with machine-learning outputs.
4.4.4 Feature Importance and Correlation
The feature-feature correlation map in Figure 4.4 shows that the electronegativity, atomic radius, and atomic number of the TM are the more important features in determining efficient catalysts (DNN model). Electronegativity and atomic number are also highly linearly correlated with each other, and both correlate closely with the eligible-catalyst label (EC). On the other hand, the strong linear correlations of PCA1 with NC, fB, and NM imply that the compressed Coulomb-matrix feature (PCA1) contains a considerable amount of information. In addition, feature importance in the neural network, covering both linear and nonlinear dependencies, can be measured by Mean Decrease in Accuracy (MDA).54 This method, also known as permutation importance, measures the importance of a feature by permuting its values in out-of-bag (OOB) samples.55 The MDA results (Table 4.1) show that NC is the most important feature. The correlation between features for LGBM is shown in Figure 4.5a (adsorption energy and free energy are labeled FE). As expected, the PCA features have no linear correlation with one another. To account for nonlinear dependencies, we used Random Forest (RF) and Mutual Information (MI) techniques to identify the most important features. The results in Figure 4.5b show that RF ranks the principal components as the most important features, whereas MI ranks R, Z, χ, and NH highest. In addition, the feature importance calculated by MDA (Table 4.2) identifies NH as the most relevant feature. Taken together, these results suggest that NH is the most important feature. However, all 15 features are essential for our prediction, and the high accuracy is obtained only by considering all of them.
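The idea behind permutation importance can be illustrated with a short, self-contained sketch. It is not the implementation used for Tables 4.1 and 4.2. The toy model, the data, and the function names are hypothetical, the score is the mean squared error of a regression model rather than a classification accuracy, and the same samples are used for scoring instead of out-of-bag samples.

```python
import random


def mean_squared_error(model, rows, targets):
    """Average squared difference between model outputs and targets."""
    return sum((model(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)


def permutation_importance(model, rows, targets, feature, repeats=10, seed=0):
    """Mean increase in MSE after shuffling one feature column."""
    rng = random.Random(seed)
    baseline = mean_squared_error(model, rows, targets)
    increases = []
    for _ in range(repeats):
        column = [r[feature] for r in rows]
        rng.shuffle(column)
        # Rebuild the rows with only the chosen column permuted.
        shuffled = [r[:feature] + [v] + r[feature + 1:]
                    for r, v in zip(rows, column)]
        increases.append(mean_squared_error(model, shuffled, targets) - baseline)
    return sum(increases) / repeats


def toy_model(row):
    """Hypothetical fitted model that uses feature 0 and ignores feature 1."""
    return 2.0 * row[0]


rows = [[float(i), float((7 * i) % 5)] for i in range(20)]
targets = [2.0 * r[0] for r in rows]
importance_used = permutation_importance(toy_model, rows, targets, feature=0)
importance_ignored = permutation_importance(toy_model, rows, targets, feature=1)
```

A feature the model relies on gets a positive importance, a feature it ignores gets zero, and a negative value means that shuffling happened to improve the score, which is how negative weights can arise.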
Table 4.1. Measurement of feature importance by MDA for DNN.
Features (partially recovered): PCA1, Z, χ
Weights (in listed order): 0.01±0.0128, 0.01±0.0197, 0.0033±0.0000, 0.003±0.0033, 0.0013±0.0025, -0.037±0.0065, -0.047±0.0065

Table 4.2. Measurement of feature importance by MDA (LGBM).
Features (partially recovered): NH, PCA3, R, fB, χ, PCA2, PCA1
Weights (in listed order): 1.3670±0.4250, 0.1543±0.1414, 0.1480±0.0630, 0.0325±0.0189, 0.0254±0.0073, 0.0063±0.0209, -0.0107±0.0165, -0.0157±0.0297, -0.0268±0.0145, -0.0540±0.0338, -0.0957±0.0441, -0.1128±0.0373, -0.1354±0.1088
Figure 4.4. Feature-feature correlation map (correlation values in %; EC: eligible catalyst).
Figure 4.5. (a) Feature-feature correlation map (correlation values in %; EC: eligible catalyst). (b) The most important ranking features predicted by random forest (RF, blue columns) and mutual information (MI) methods.
4.4.5 Adsorption of N2, N2H, NH2, NH3 and H on SACs
The DFT results for the first and second screening steps ($E_{\mathrm{N_2}}$ and $\Delta G_{\mathrm{N_2 \to N_2H}}$) are presented in Figure 4.6, which shows only the samples with an adsorption energy ($E_{\mathrm{N_2}}$) below -0.5 eV (first step). The results fall into two groups, eligible and non-eligible; only the eligible catalysts ($\Delta G_{\mathrm{N_2 \to N_2H}} < 0.55$ eV) proceed to the next round of screening.
This step filters out more than half of the samples, so it may control the overall screening rate for NRR. For the next round, we calculated $\Delta G_{\mathrm{NH_2 \to NH_3}}$ versus $\Delta G_{\mathrm{N_2 \to N_2H}}$, as shown in Figure 4.7. After the two-step screening, all remaining samples meet the third criterion for NRR catalysts.
For the final analysis, we considered only the samples with $\Delta G_{\mathrm{NH_3\,desorbed}} < 0.8$ eV as the most promising catalysts for NRR. To investigate whether the SACs can suppress HER and to determine the onset potentials, we calculated the adsorption free energies of H and N2. The selectivity of the materials for NRR over HER is presented in Figure 4.8. The adsorption free energy of H is more positive than that of N2 for all catalysts except MoB1C2. Therefore, NRR can suppress HER on these catalysts (the optimal HER activity corresponds to ΔG*H = 0). The SACs thus show excellent selectivity for NRR over HER.
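The selectivity comparison reduces to a single inequality per catalyst, as the following sketch shows. The function name and the free energies in the example are hypothetical placeholders, and assigning the case of exactly equal free energies to HER is an arbitrary choice made here.

```python
def dominant_reaction(dg_h, dg_n2):
    """Return "NRR" when H adsorption is less favorable than N2 adsorption.

    dg_h  : adsorption free energy of H (eV)
    dg_n2 : adsorption free energy of N2 (eV)
    """
    return "NRR" if dg_h > dg_n2 else "HER"


# Placeholder adsorption free energies (eV) for two hypothetical catalysts.
candidates = {
    "catalyst_A": {"dg_h": 0.35, "dg_n2": -0.60},
    "catalyst_B": {"dg_h": -0.80, "dg_n2": -0.55},
}
selectivity = {name: dominant_reaction(**values)
               for name, values in candidates.items()}
```

The rule encodes the assumption that the adsorbate with the more negative adsorption free energy preferentially occupies the active site.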
Figure 4.6. First-step screening results of B-doped graphene SACs based on hydrogenation of N2 into N2H and adsorption energy of N2.
Figure 4.7. Third-step screening results of B-doped graphene SACs on the basis of free energy barrier of NH2 to NH3 versus free energy barrier of N2 into N2H.
Figure 4.8. Free energies calculated for H and N2 adsorption, divided into two regions: ΔG*H < ΔG*N2 (HER dominant) and ΔG*H > ΔG*N2 (NRR dominant).
4.4.6 Stability and Free Energy Pathways
The stability of the catalysts, which plays a critical role in catalytic activity, also needs to be investigated. When stability is taken into account, only three catalysts are suitable for NRR. Figure 4.9 depicts the reaction free energies for these three promising catalysts. NRR is a six-electron reaction, N2 + 6H+ + 6e- → 2NH3, for which both side-on and end-on adsorption geometries of *N2 are possible. As seen in Figure 4.9, the first protonation step (*N2 + H+ + e- → *N2H) is the potential-determining step (PDS); that is, it is the most uphill electrochemical step for the eligible catalysts. Considering all data, the end-on configuration appears more favorable for NRR than the side-on configuration; among the best catalysts, only HfB1C2 is more feasible through the side-on mechanism. The free energy change of the first hydrogenation step on Ru(0001), which has the lowest overpotential for NRR among transition metals, is ~1 eV,13 whereas the free energy changes of the PDS for the B-doped SACs are much lower than 1 eV. For most of the catalysts there is no barrier for NRR after the PDS; however, the free energy change for NH3 desorption is positive and larger than that of the PDS. Among the efficient catalysts, the limiting potentials are -0.42 V (HfB1C2), -0.44 V (TcB3C1), and -0.29 V (CrB3C1, the lowest overpotential), which are much smaller in magnitude than those of other known bulk metals.53
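The relation between the PDS and the limiting potential can be made explicit with a small sketch. It assumes the computational hydrogen electrode picture, in which each proton-electron transfer step shifts by eU under an applied potential U, so that the limiting potential is minus the largest step free energy divided by e. Apart from the 0.29 eV value of the first step, which mirrors the limiting potential quoted for CrB3C1, the step free energies and function names are hypothetical placeholders.

```python
def limiting_potential(step_free_energies):
    """Return (index of the PDS, limiting potential in V).

    step_free_energies: free energy change of each proton-electron
    transfer step at zero potential, in eV.
    """
    pds_index = max(range(len(step_free_energies)),
                    key=lambda i: step_free_energies[i])
    # With e = 1 in units of eV/V, U_L = -max(dG) numerically.
    return pds_index, -step_free_energies[pds_index]


def steps_at_potential(step_free_energies, potential):
    """Shift every proton-electron transfer step by e*U (assumed CHE model)."""
    return [dg + potential for dg in step_free_energies]


# Six electrochemical steps (eV); only the first value mirrors the text.
steps = [0.29, -0.60, -0.10, -0.85, 0.05, -0.40]
pds, u_limiting = limiting_potential(steps)
shifted = steps_at_potential(steps, u_limiting)
```

At the limiting potential the PDS becomes thermoneutral and every other electrochemical step is downhill. NH3 desorption is not a proton-electron transfer step, so it is not shifted by the applied potential in this picture.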
For NRR, the TM should have empty d orbitals to accept the lone-pair electrons of N2, while donating electrons into the π* orbitals of N-N weakens the nitrogen-nitrogen bond and strengthens the TM-N binding. Boron has three valence electrons, which can take part in the π-backdonation mechanism.43 However, the main role of the boron atoms in the graphene support is to transfer electrons to the TM. Electrons on the TM probably increase charge transfer, followed by reduction of the reactant. To further understand the role of charge transfer, we computed Bader charges for the TMs. They show that Tc, Cr, and Hf carry charges of +0.55 e, +1.015 e, and +1.73 e, respectively, whereas the adsorbed N atoms carry negative charges. Positive and small negative Bader charges lead to a large ΔG*H but a small ΔG*N2 for NRR.56
Figure 4.9. Free energy diagrams calculated by DFT for NRR on B-doped SACs via the distal mechanism for (a) CrB3C1 and (b) TcB3C1 and via the consecutive mechanism for (c) HfB1C2, at zero and applied potentials. The blue and red curves depict the free energy changes at zero and applied potential, respectively.