In machine learning (ML), a model cannot yet be expected to learn the patterns in data and predict
unseen data without any help. We should therefore present the data in a form that makes the task
easier for the model. A good selection of features reduces computational time, improves accuracy,
and allows the model to solve the task with a small data set. Feature selection thus deserves
particular attention in computational chemistry, where data are limited. Once trained on a data
set, ML models can take geometrical structures as input to explore high-activity catalysts. Raw
geometric coordinates are not used directly in building these models, because without data
processing they make explicit neither the translational, rotational, and permutational symmetries
nor the similarities and differences between structures.^{47} We therefore use the Coulomb
matrix^{48} as a descriptor of the atomic geometric structure, as defined in Eq. (4.3):

$$
M_{ij} =
\begin{cases}
0.5\, Z_i^{2.4}, & i = j \\
\dfrac{Z_i Z_j}{\left| R_i - R_j \right|}, & i \neq j
\end{cases}
\tag{4.3}
$$

where Z_{i} is the atomic number of atom i and R_{i} is its position. Because the Coulomb matrix is
symmetric (each off-diagonal element appears twice), we keep only its upper triangular part
(taking the sorted elements). Nevertheless, the dimension of the Coulomb matrix remains high, and
it is useful to reduce it, a step known as dimensionality reduction. The complexity of a model
increases exponentially with dimensionality, and the increased complexity requires more data for
training.^{49}
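As a minimal sketch of the descriptor in Eq. (4.3), the snippet below builds a Coulomb matrix in plain Python and flattens its upper triangle into a feature vector. The function names, the toy two-atom geometry, and the length unit are assumptions made for illustration; the sorting step mentioned above is not shown because the text does not specify it.

```python
import math


def coulomb_matrix(charges, positions):
    """Build the Coulomb matrix of Eq. (4.3) as a list of rows.

    charges: atomic numbers Z_i; positions: (x, y, z) tuples, one per atom.
    The length unit of the positions is an assumption (not stated in the text).
    """
    n = len(charges)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                # Diagonal term: 0.5 * Z_i ** 2.4
                matrix[i][j] = 0.5 * charges[i] ** 2.4
            else:
                # Off-diagonal term: Z_i * Z_j / |R_i - R_j|
                distance = math.dist(positions[i], positions[j])
                matrix[i][j] = charges[i] * charges[j] / distance
    return matrix


def upper_triangle(matrix):
    """Flatten the upper triangle (diagonal included) into a feature vector."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i, n)]


# Toy example: two nitrogen atoms 1.1 length units apart (illustrative values).
cm = coulomb_matrix([7, 7], [(0.0, 0.0, 0.0), (0.0, 0.0, 1.1)])
features = upper_triangle(cm)
```

For N atoms the flattened vector has N(N+1)/2 entries, which is why a further reduction step is needed for realistic slab models.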

Principal component analysis (PCA) is widely employed for dimensionality reduction. PCA searches
for the directions of maximum variance in the high-dimensional space (those carrying the most
information about the data) and projects the data onto fewer dimensions that are statistically
uncorrelated.^{50} The new dimensions, which are orthogonal axes, are called principal components.
Here, we use PCA to reduce the Coulomb matrix to one axis (PCA1) for the DNN and to seven axes
(PCA1 to PCA7) for LGBM, with 47% and 100% explained variance, respectively.
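To make the projection and the explained-variance figure concrete, here is a small dependency-free sketch that extracts only the first principal component by power iteration on the covariance matrix. It is not the implementation used in this work; the function name, the iteration count, and the five toy samples are assumptions, and a library PCA routine would normally be used instead.

```python
import math


def first_principal_component(rows, iterations=500):
    """Return (direction, explained_variance_ratio, scores) for PCA1.

    rows: list of equal-length feature vectors (for example, flattened
    Coulomb matrices). Assumes the data have non-zero variance.
    """
    n, d = len(rows), len(rows[0])
    means = [sum(r[k] for r in rows) / n for k in range(d)]
    centered = [[r[k] - means[k] for k in range(d)] for r in rows]
    # Sample covariance matrix of the centered data.
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    # Power iteration converges to the eigenvector of the largest eigenvalue.
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient gives the variance captured along v.
    eigenvalue = sum(v[a] * sum(cov[a][b] * v[b] for b in range(d))
                     for a in range(d))
    total_variance = sum(cov[a][a] for a in range(d))
    # Project every centered sample onto the first principal axis.
    scores = [sum(c[k] * v[k] for k in range(d)) for c in centered]
    return v, eigenvalue / total_variance, scores


# Five illustrative two-dimensional samples lying close to a line.
data = [(2.0, 1.9), (1.0, 1.2), (0.0, -0.1), (-1.0, -0.8), (-2.0, -2.2)]
direction, ratio, scores = first_principal_component(data)
```

The returned ratio is the quantity quoted above as explained variance: a value of 0.47 would mean that PCA1 alone keeps 47% of the total variance.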

**Figure 4.1.** (a) Structures of B-doped graphene with metal single atoms (26 transition metals) considered
for screening. (b, c) Side-on (b) and end-on (c) structures for N2 reduction. Color code: metal in purple,
boron in green, carbon in brown, and nitrogen in light blue.

**Figure 4.2.** Architecture of the artificial neural network (10 neurons in each hidden layer) used in this study.
The input data consist of the optimized geometry of one of the SACs with an adsorbed intermediate (^{*}N2, ^{*}N2H,
^{*}NH2, or ^{*}NH3). Each structural geometry is then characterized by seven features: PCA1, χ, Z, R, NC, fB, and NM.


By investigating combinations of many features, we found with the DNN that the catalytic activity is mainly governed by seven features: PCA1, the electronegativity of the TM (χ), the atomic number of the TM (Z), the atomic radius of the TM (R), the coordination number of the TM (NC), the fraction of B atoms among the coordinating atoms in TM-BxCy-Gr [fB = x/(x+y)], and the number of nitrogen atoms adsorbed on the TM, which is 2 for the side-on and 1 for the end-on mechanism in Figure 4.1b-c (NM = 2/1). For the LGBM regression model, the seven features PCA1 to PCA7 are used together with eight further features, namely χ, Z, R, NC, fB, NM, the total number of nitrogen atoms (NN), and the number of hydrogen atoms (NH), to predict adsorption energies and free energies.
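The seven DNN inputs can be assembled as in the sketch below. The class and field names are hypothetical, NC is taken as x + y (an assumption that follows from the definition of fB over the coordinating atoms), and the numerical values for the example site are illustrative placeholders rather than data from this work.

```python
from dataclasses import dataclass


@dataclass
class SiteDescriptor:
    """Hypothetical container for one TM-BxCy-Gr site."""
    pca1: float               # first principal component of the Coulomb matrix
    electronegativity: float  # chi of the transition metal
    atomic_number: int        # Z
    atomic_radius: float      # R
    n_boron: int              # x in TM-BxCy-Gr
    n_carbon: int             # y in TM-BxCy-Gr
    side_on: bool             # True: side-on N2 (NM = 2); False: end-on (NM = 1)

    def feature_vector(self):
        # Assumption: the coordination number counts the B and C neighbors.
        n_coord = self.n_boron + self.n_carbon
        # fB = x / (x + y), the boron fraction among coordinating atoms.
        f_b = self.n_boron / n_coord
        n_m = 2 if self.side_on else 1
        return [self.pca1, self.electronegativity, self.atomic_number,
                self.atomic_radius, n_coord, f_b, n_m]


# Illustrative placeholder values for a Cr site with three B and one C neighbor.
site = SiteDescriptor(pca1=0.0, electronegativity=1.66, atomic_number=24,
                      atomic_radius=1.28, n_boron=3, n_carbon=1, side_on=False)
vector = site.feature_vector()
```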

**4.4.2 Classification by Deep Neural Network (DNN)**

The NRR is a highly complicated process that can proceed through different pathways, such as the
distal, alternating, and consecutive enzymatic mechanisms.^{51} Each pathway includes several
intermediate states, so screening requires enormous computing resources. Investigating every step
is therefore inefficient, and simple descriptors are highly desirable.

On the basis of previous studies, we employ a three-key-step method to select active
electrocatalysts for NRR.^{51-53} For an efficient catalyst, the adsorption energy of *N2 on the
active site should be more negative than -0.5 eV. N2 is an extremely stable molecule because of
its inert triple bond, so the hydrogenation of *N2 to *N2H requires a large energy. In addition,
*NH2 may be stabilized on the surface because N has one half-filled sp^{3} hybrid orbital, so the
free energy of its hydrogenation (ΔG_{NH2→NH3}) is expected to be uphill. We therefore considered
the three key steps proposed by Ling et al.^{51}: ΔE_{N2} < -0.50 eV, ΔG_{N2→N2H} < 0.55 eV, and
ΔG_{NH2→NH3} < 0.7 eV. Here, ΔE_{N2} is the adsorption energy of N2, ΔG_{N2→N2H} is the
hydrogenation free energy of *N2 to *N2H, and ΔG_{NH2→NH3} is the hydrogenation free energy of
*NH2 to *NH3. We trained the DNN in terms of ΔE_{N2}, ΔG_{N2→N2H}, and ΔG_{NH2→NH3}: a catalyst
that meets all three requirements is labeled eligible as a good NRR catalyst; otherwise, it is
labeled non-eligible. Because the pattern in the data is very complicated, a neural network was
needed to classify the samples into eligible and non-eligible classes. The trained ANN can predict
eligible catalysts from a single optimized geometry for the aforementioned three steps, as seen in
Figure 4.2. In this way, many unnecessary DFT calculations are skipped, and the catalysts with a
high probability are prioritized for experimental investigation. To select eligible catalysts, we
consider seven features: PCA1, χ, Z, R, NC, fB, and NM. We divided the data (~500 samples) into
two parts: training (85%) and testing (15%). To monitor overfitting without leaking information
from the test set, the training samples were further split into training (75%) and validation
(25%) parts. As mentioned earlier, the output of the DNN is the probability that a catalyst is
efficient. Samples with a probability above 0.5 are considered eligible candidates for NRR. The
model shows good accuracy (~85%) in predicting eligible candidates.
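The labeling rule, the data split, and the decision threshold described above can be sketched as follows. All function names and the random seed are assumptions for illustration; the strict inequalities follow the "more negative than" wording of the text, and the network itself is not reproduced here.

```python
import random


def is_eligible(e_n2, dg_n2_to_n2h, dg_nh2_to_nh3):
    """Binary training label from the three key-step criteria (energies in eV)."""
    return e_n2 < -0.50 and dg_n2_to_n2h < 0.55 and dg_nh2_to_nh3 < 0.70


def split_indices(n_samples, seed=0):
    """Shuffle sample indices, hold out 15% for testing, then 25% of the rest for validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = round(0.15 * n_samples)
    test_idx, rest = indices[:n_test], indices[n_test:]
    n_val = round(0.25 * len(rest))
    val_idx, train_idx = rest[:n_val], rest[n_val:]
    return train_idx, val_idx, test_idx


def predicted_eligible(probability, threshold=0.5):
    """Turn the network's output probability into a class decision."""
    return probability > threshold


train_idx, val_idx, test_idx = split_indices(500)
```

With 500 samples this gives 319 training, 106 validation, and 75 test samples; the test indices are never seen during training or model selection.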

**4.4.3 Regression by Light Gradient Boosting Machine (LightGBM)**

The adsorption energies and free energies of some intermediate steps involved in NRR are predicted
by ML, which greatly reduces the number of DFT calculations. To this end, we fitted the model to
DFT data using fifteen engineered features: the seven features PCA1 to PCA7 (the Coulomb-matrix
features reduced to seven by PCA with 100% explained variance) and the eight features χ, Z, R, NC,
fB, NM, NN, and NH. We trained several regression models on the data and reached the highest
accuracy with the Light Gradient Boosting Machine (LGBM) model (Figure 4.3, RMSE = 0.11 eV). After
training the LGBM model, the relaxed geometry with N2 (E_{slab+ads}) can be used to predict the
adsorption energy of N2 (ΔE_{ads} = E_{slab+ads} - E_{slab} - E_{N2}). In addition, the relaxed
geometries of *N-NH, *NH2, and *NH3 can be used as model inputs to predict ΔG_{N2→N2H},
ΔG_{NH2→NH3}, and ΔG_{NH3→Desorbed}, respectively.
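Two quantities in this paragraph are simple enough to state as code: the adsorption energy used as the regression target and the RMSE used to compare models. The sketch below assumes all energies are in eV; the function names and the numbers in the usage lines are illustrative placeholders, not values from this work.

```python
import math


def adsorption_energy(e_slab_ads, e_slab, e_n2):
    """Adsorption energy: E(slab + adsorbate) - E(slab) - E(N2), all in eV."""
    return e_slab_ads - e_slab - e_n2


def rmse(predicted, reference):
    """Root-mean-square error between model outputs and DFT reference values."""
    squared = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(squared))


# Placeholder total energies (eV), chosen only to illustrate the bookkeeping.
example_e_ads = adsorption_energy(-510.0, -492.0, -17.4)

# Placeholder predictions versus reference values (eV).
example_rmse = rmse([0.1, -0.3], [0.0, -0.2])
```

A negative adsorption energy means that binding N2 to the slab is energetically favorable under this sign convention.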

**Figure 4.3.** (a) Artificial neural network (ANN) predictions for the test samples: predicted probability
versus the horizontal axis [X, arbitrary, used to avoid overlapping data points]. (b) Prediction performance plot
of DFT calculations versus machine-learning outputs.

**4.4.4 Feature Importance and Correlation**

The feature-feature correlation map in Figure 4.4 shows that the electronegativity, atomic radius,
and atomic number of the TMs are the most important features in determining efficient catalysts
(DNN model). The electronegativity and atomic number also show a highly linear correlation with
each other, and both correlate closely with the eligible-catalyst label (EC). On the other hand,
the strong linear correlations of PCA1 with NC, fB, and NM imply that the compressed
Coulomb-matrix feature (PCA1) contains a considerable amount of information. In addition, the
feature importance in the neural network, covering both linear and nonlinear dependencies, can be
measured by the Mean Decrease in Accuracy (MDA).^{54} This method, also known as permutation
importance, measures the importance of a feature by permuting it in out-of-bag (OOB)
samples.^{55} The MDA (Table 4.1) shows that NC is the most important feature. The correlation
between features for LGBM is shown in Figure 4.5a (where the adsorption energy and free energy
are labeled FE). As expected, the PCA features have no linear correlation with one another. To
take nonlinear dependencies into account, we used the Random Forest (RF) and Mutual Information
(MI) techniques to identify the most important features. The results in Figure 4.5b show that RF
ranks the principal components as the most important features, whereas MI ranks R, Z, χ, and NH
highest. In addition, we calculated the feature importance by MDA (Table 4.2), which identifies
NH as the most relevant feature. Taken together, these results suggest that NH is the most
important feature. However, all 15 features are essential for our prediction, and the high
accuracy is obtained only by considering all of them.
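The idea behind MDA (permutation importance) can be illustrated with a short, model-agnostic sketch: shuffle one feature column at a time and record how much the accuracy drops. The function names, the toy classifier, and the six toy samples are assumptions for illustration and do not reproduce the models or data of this work.

```python
import random
import statistics


def accuracy(model, rows, labels):
    """Fraction of samples for which the model predicts the given label."""
    return sum(model(r) == y for r, y in zip(rows, labels)) / len(labels)


def permutation_importance(model, rows, labels, n_repeats=20, seed=0):
    """Return one (mean drop, standard deviation) pair per feature."""
    rng = random.Random(seed)
    baseline = accuracy(model, rows, labels)
    results = []
    for k in range(len(rows[0])):
        drops = []
        for _ in range(n_repeats):
            # Shuffle only column k, which breaks its link to the labels.
            column = [r[k] for r in rows]
            rng.shuffle(column)
            shuffled = [r[:k] + [column[i]] + r[k + 1:]
                        for i, r in enumerate(rows)]
            drops.append(baseline - accuracy(model, shuffled, labels))
        results.append((statistics.mean(drops), statistics.pstdev(drops)))
    return results


# Toy classifier that depends on feature 0 only (hypothetical).
def toy_model(row):
    return int(row[0] > 0.5)


rows = [[0.1, 5.0], [0.9, 1.0], [0.2, 3.0], [0.8, 2.0], [0.3, 4.0], [0.7, 0.0]]
labels = [0, 1, 0, 1, 0, 1]
importances = permutation_importance(toy_model, rows, labels)
```

Because the estimate is a difference between two noisy scores, an unimportant feature can receive a small negative weight when shuffling happens to improve the score, which is how negative entries in an MDA table should be read.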

**Table 4.1.** Measurement of feature importance by MDA for DNN.

| Features | Weight |
|---|---|
| **NC** | **0.01±0.0128** |
| R | 0.01±0.0197 |
| fB | 0.0033±0.0000 |
| NM | 0.003±0.0033 |
| PCA1 | 0.0013±0.0025 |
| Z | -0.037±0.0065 |
| χ | -0.047±0.0065 |

**Table 4.2.** Measurement of feature importance by MDA (LGBM).

| Features | Weight |
|---|---|
| **NH** | **1.3670±0.4250** |
| NC | 0.1543±0.1414 |
| PCA3 | 0.1480±0.0630 |
| R | 0.0325±0.0189 |
| fB | 0.0254±0.0073 |
| PCA4 | 0.0063±0.0209 |
| NN | 0±0.0000 |
| PCA5 | -0.0020±0.0768 |
| PCA6 | -0.0107±0.0165 |
| NM | -0.0157±0.0297 |
| Z | -0.0268±0.0145 |
| PCA7 | -0.0540±0.0338 |
| χ | -0.0957±0.0441 |
| PCA2 | -0.1128±0.0373 |
| PCA1 | -0.1354±0.1088 |


**Figure 4.4.** Feature-feature correlation map (correlation values in %; EC: eligible catalyst).

**Figure 4.5.** (a) Feature-feature correlation map (correlation values in %; EC: eligible catalyst). (b) Top-ranked
features predicted by the random forest (RF, blue columns) and mutual information (MI) methods.

**4.4.5 Adsorption of N2, N2H, NH2, NH3 and H on SACs**

The DFT results for the first and second screening steps (ΔE_{N2}, ΔG_{N2→N2H}) are presented in
Figure 4.6, which shows only the samples with an adsorption energy (ΔE_{N2}) below -0.5 eV (first
step). The results fall into two groups, eligible and non-eligible, and only the eligible
catalysts (ΔG_{N2→N2H} < 0.55 eV) pass to the next round of screening. This step filters out more
than half of the samples and may therefore control the overall rate of screening for NRR. For the
next round, we calculated ΔG_{NH2→NH3} versus ΔG_{N2→N2H}, as shown in Figure 4.7. After the
two-step screening, all remaining samples meet the third criterion for NRR catalysts. For the
final analysis, we considered only the samples with ΔG_{NH3→Desorbed} below 0.8 eV as the most
promising catalysts for NRR. To investigate whether the SACs can suppress HER and to determine
the onset potentials, we calculated the adsorption free energies of H and N2. The selectivity
between HER and NRR is presented in Figure 4.8. The adsorption free energy of H is more positive
than that of N2 for all catalysts except MoB1C2. Therefore, HER is suppressed in favor of NRR on
these catalysts (the optimal catalytic activity for HER appears at ΔG*H = 0), and the SACs show
excellent selectivity against HER.
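The staged screening and the selectivity check described above amount to a sequence of filters, sketched below. The dictionary keys, function names, and the four candidates "A" to "D" with their energies are hypothetical placeholders; strict inequalities are assumed for every threshold.

```python
def screening_funnel(candidates):
    """Apply the screening criteria in order and record how many candidates survive."""
    steps = [
        ("N2 adsorption", lambda c: c["e_n2"] < -0.50),
        ("N2 -> N2H", lambda c: c["dg_n2_to_n2h"] < 0.55),
        ("NH2 -> NH3", lambda c: c["dg_nh2_to_nh3"] < 0.70),
        ("NH3 desorption", lambda c: c["dg_nh3_desorb"] < 0.80),
    ]
    survivors = list(candidates)
    history = []
    for name, passes in steps:
        survivors = [c for c in survivors if passes(c)]
        history.append((name, len(survivors)))
    return survivors, history


def nrr_dominant(dg_h, dg_n2):
    """NRR is favored over HER when H binds less favorably than N2."""
    return dg_h > dg_n2


# Hypothetical candidates with placeholder energies in eV.
candidates = [
    {"name": "A", "e_n2": -0.80, "dg_n2_to_n2h": 0.40, "dg_nh2_to_nh3": 0.30,
     "dg_nh3_desorb": 0.60, "dg_h": 0.20, "dg_n2": -0.50},
    {"name": "B", "e_n2": -0.30, "dg_n2_to_n2h": 0.40, "dg_nh2_to_nh3": 0.30,
     "dg_nh3_desorb": 0.60, "dg_h": 0.10, "dg_n2": -0.20},
    {"name": "C", "e_n2": -0.90, "dg_n2_to_n2h": 0.90, "dg_nh2_to_nh3": 0.30,
     "dg_nh3_desorb": 0.60, "dg_h": 0.10, "dg_n2": -0.60},
    {"name": "D", "e_n2": -0.70, "dg_n2_to_n2h": 0.50, "dg_nh2_to_nh3": 0.20,
     "dg_nh3_desorb": 1.10, "dg_h": -0.90, "dg_n2": -0.40},
]
survivors, history = screening_funnel(candidates)
selective = [c["name"] for c in survivors if nrr_dominant(c["dg_h"], c["dg_n2"])]
```

Ordering the filters from cheapest to most expensive calculation keeps the number of later-stage DFT runs small, which is the practical point of a staged screen.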

**Figure 4.6.** First-step screening results of B-doped graphene SACs based on hydrogenation of
N2 into N2H and adsorption energy of N2.


**Figure 4.7.** Third-step screening results of B-doped graphene SACs on the basis of the free energy barrier
of NH2 to NH3 versus the free energy barrier of N2 into N2H.

**Figure 4.8.** Free energies calculated for H and N2 adsorption, divided into two regions:
ΔG*H < ΔG*N2 (HER dominant) and ΔG*H > ΔG*N2 (NRR dominant).

**4.4.6 Stability and Free Energy Pathways**

The stability of the catalysts, which plays a critical role in catalytic activity, needs to be
investigated. When stability is taken into account, only three catalysts are suitable for NRR.
Figure 4.9 depicts the reaction free energies for these three promising catalysts. The NRR is a
six-electron reaction, N2 + 6H^{+} + 6e^{-} → 2NH3, in which both side-on and end-on adsorption
geometries are possible for *N2. As seen in Figure 4.9, the first protonation step
(^{*}N2 + H^{+} + e^{-} → ^{*}N2H) is the potential-determining step (PDS).

The first step is therefore the most uphill step for the eligible catalysts. Considering all
data, the end-on configuration appears more favorable for NRR than the side-on configuration;
among the best catalysts, only HfB1C2 is more feasible for NRR through the side-on mechanism. The
free energy change of the first hydrogenation step on Ru(0001), which has the lowest
overpotential for NRR among transition metals, is ~1 eV,^{13} whereas the free energy changes of
the PDS for the B-doped SACs are much lower than 1 eV. For most catalysts, there is no barrier
for NRR after the PDS; however, the free energy change for desorption of NH3 is positive and
larger than that of the PDS. Among the efficient catalysts, the limiting potentials are -0.42 V
(HfB1C2), -0.44 V (TcB3C1), and -0.29 V (CrB3C1, the lowest overpotential), which are much lower
in magnitude than those of other known bulk metals.^{53}
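The relation between the step free energies, the PDS, and the limiting potential can be sketched as below, assuming the computational hydrogen electrode picture in which each proton-electron transfer step shifts by eU under an applied potential U. The function names are hypothetical and the six step energies are illustrative placeholders, not the computed values of this work; NH3 desorption is left out because it is not an electrochemical step.

```python
def limiting_potential(step_free_energies):
    """Return (U_L in V, index of the PDS) for electrochemical steps given in eV.

    U_L = -max(dG_i) / e: the potential at which the most uphill
    proton-electron transfer step becomes thermoneutral.
    """
    pds_index = max(range(len(step_free_energies)),
                    key=step_free_energies.__getitem__)
    return -step_free_energies[pds_index], pds_index


def shift_with_potential(step_free_energies, potential):
    """Free energy of each reduction step at applied potential U: dG(U) = dG(0) + eU."""
    return [dg + potential for dg in step_free_energies]


# Six placeholder step free energies (eV) for the six proton-electron transfers.
steps = [0.29, -0.50, -0.10, -0.80, 0.10, -0.30]
u_limiting, pds = limiting_potential(steps)
shifted = shift_with_potential(steps, u_limiting)
```

At the limiting potential every electrochemical step in the shifted profile is downhill or flat, which corresponds to the applied-potential curves in a free energy diagram.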

For NRR, the TMs should have empty d-orbitals to accept the lone-pair electrons of N2, while
donating electrons to the π^{*} orbitals of N-N weakens the nitrogen bond and strengthens the
binding with N. Boron has three valence electrons, which can take part in the π-backdonation
mechanism.^{43} However, the main role of the boron atoms in the graphene support is to transfer
electrons to the TM. Electrons on the TM probably increase the charge transfer, followed by
reduction of the reactant. To further understand the role of charge transfer, we computed the
Bader charges of the TMs. They revealed that Tc, Cr, and Hf gained +0.55 e, +1.015 e, and
+1.73 e, respectively. On the other hand, the adsorbed N atoms obtained negative charges.
Positive and small negative Bader charges lead to a large ΔG*H but a small ΔG*N2 for NRR.^{56}


**Figure 4.9.** Free energy diagrams calculated by DFT for NRR on B-doped SACs via the distal mechanism for
(a) CrB3C1 and (b) TcB3C1 and via the consecutive mechanism for (c) HfB1C2, at zero and applied potentials. The
blue and red curves depict the free energy changes for NRR at zero and applied potentials, respectively.
