(1)

Multivariate statistical methods for the analysis, monitoring and optimization of processes

Jay Liu

Dept. of Chemical Engineering

Pukyong National University

(2)

3. Partial Least Squares

We will cover:

Multiple Linear Regression (least squares) [MLR]

Principal Component Regression [PCR]

Partial Least Squares or Projection to Latent Structures [PLS]

(3)

Quantitative modeling

• Relationships between two sets of multivariate data, X and Y

• In process modeling and optimization

• Process variables → yield / quality

• Chemical composition, physical measurements → quality, biological activity

• Chemical structure → reactivity, properties, biological activity

• In multivariate calibration

• Signals (spectra) → concentrations, energy contents, etc.

(4)

Quantitative modeling

• Starting point

• Data set = tables (matrices) of N rows (objects, samples, …) and K & M columns (variables, properties, …)

Objects (cases, samples, rows, …): analytical samples, process time points, trials (experiment runs)

Variables (tags, properties, columns, …): sensors (T, P, flow, pH, conc., …), spectra, chromatograms, quality measures, yields, costs, …

X: what is “always” available; Y: what is “not always” available

(5)

• Can’t remember?

• Let’s review engineering statistics

Multiple linear regression

$\hat{Y} = XB$, where $B = (X^T X)^{-1} X^T Y$  (MLR; a small sketch in code follows below)
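As a quick numerical companion to the formula above, here is a minimal NumPy sketch; the data shapes and variable names are made up for illustration, not taken from the slides.

```python
# Minimal MLR sketch: B = (X^T X)^{-1} X^T Y, computed with a linear solve.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))      # N=20 objects, K=3 predictor variables
Y = rng.normal(size=(20, 2))      # M=2 response variables

# Solve the normal equations (X^T X) B = X^T Y instead of forming the inverse.
B = np.linalg.solve(X.T @ X, X.T @ Y)
Y_hat = X @ B                      # fitted values, Y^ = X B
```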

(6)

Multiple linear regression

• Matrix representation of MLR (M=1)

$y = Xb + e$, i.e. $y_i = b_0 + b_1 x_{i1} + b_2 x_{i2} + \cdots + b_m x_{im} + e_i$

$y = [y_1 \;\; y_2 \;\; \cdots \;\; y_n]^T$, $\quad b = [b_0 \;\; b_1 \;\; \cdots \;\; b_m]^T$, $\quad e = [e_1 \;\; e_2 \;\; \cdots \;\; e_n]^T$

$X = \begin{bmatrix} 1 & x_{11} & x_{12} & \cdots & x_{1m} \\ 1 & x_{21} & x_{22} & \cdots & x_{2m} \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & x_{n1} & x_{n2} & \cdots & x_{nm} \end{bmatrix}$

m+1: number of coefficients; n: number of data points

(7)

Multiple linear regression

• Example

– Fitting a quadratic polynomial, $y = b_0 + b_1 x + b_2 x^2 + e$, to five data points

x: −1.0, −0.5, 0.0, 0.5, 1.0
y: 1.0, 0.5, 0.0, 0.5, 2.0

In matrix form, $y = Xb + e$:

$\begin{bmatrix} 1.0 \\ 0.5 \\ 0.0 \\ 0.5 \\ 2.0 \end{bmatrix} = \begin{bmatrix} 1 & -1.0 & 1.0 \\ 1 & -0.5 & 0.25 \\ 1 & 0.0 & 0.0 \\ 1 & 0.5 & 0.25 \\ 1 & 1.0 & 1.0 \end{bmatrix} \begin{bmatrix} b_0 \\ b_1 \\ b_2 \end{bmatrix} + \begin{bmatrix} e_1 \\ e_2 \\ e_3 \\ e_4 \\ e_5 \end{bmatrix}$

Three unknowns, five equations.

(8)

Multiple linear regression

• Minimize the sum of squares of errors for $y = Xb + e$:

$S_r = \sum e_i^2 = e^T e = (y - Xb)^T (y - Xb)$

• Setting the derivative to zero gives the normal equations:

$\dfrac{\partial S_r}{\partial b} = 0 \;\Rightarrow\; (X^T X)\,b = X^T y$  (an “Ax = b” problem)

• Solutions (a numerical sketch follows below):

• 1. LU decomposition or other methods for solving the linear algebraic equations $(X^T X)\,b = X^T y$

• 2. Matrix inversion: $b = (X^T X)^{-1} X^T y$
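A worked version of the five-point quadratic fit in NumPy, assuming the x values −1.0, −0.5, 0.0, 0.5, 1.0 and y values 1.0, 0.5, 0.0, 0.5, 2.0 from the table above; both solution routes named on the slide are shown.

```python
# Quadratic least-squares fit: columns of X are 1, x, x^2.
import numpy as np

x = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])
y = np.array([ 1.0,  0.5, 0.0, 0.5, 2.0])

X = np.column_stack([np.ones_like(x), x, x**2])

# Route 1: solve the normal equations (X^T X) b = X^T y.
b = np.linalg.solve(X.T @ X, X.T @ y)

# Route 2 (numerically safer, same answer here):
# b, *_ = np.linalg.lstsq(X, y, rcond=None)

print(b)        # [b0, b1, b2]
print(X @ b)    # fitted values at the five x points
```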

(9)

What if X’s are correlated

• High vs. no correlation between $x_1$ and $x_2$

• What happens when a very small (measurement) noise is added to X?

No correlation:
$X^T X = \begin{bmatrix} 1.0 & 0.0 \\ 0.0 & 1.0 \end{bmatrix}$, $\quad (X^T X)^{-1} = \begin{bmatrix} 1.0000 & 0.0 \\ 0.0 & 1.0000 \end{bmatrix}$

No correlation, small noise added:
$X^T X = \begin{bmatrix} 1.0000 & 0.0 \\ 0.0 & 1.0001 \end{bmatrix}$, $\quad (X^T X)^{-1} = \begin{bmatrix} 0.9999 & 0.0 \\ 0.0 & 0.9999 \end{bmatrix}$

High correlation:
$X^T X = \begin{bmatrix} 1.0000 & 0.9999 \\ 0.9999 & 1.0000 \end{bmatrix}$, $\quad (X^T X)^{-1} = \begin{bmatrix} 5000.25 & -4999.75 \\ -4999.75 & 5000.25 \end{bmatrix}$

High correlation, small noise added:
$X^T X = \begin{bmatrix} 1.0001 & 0.9999 \\ 0.9999 & 1.0000 \end{bmatrix}$, $\quad (X^T X)^{-1} = \begin{bmatrix} 3333.34 & -3333.11 \\ -3333.11 & 3333.34 \end{bmatrix}$

When the columns are uncorrelated, the tiny noise barely changes $(X^T X)^{-1}$; when they are highly correlated, the same tiny noise changes the inverse drastically (a numerical sketch follows below).
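A small sketch of the effect described above, using made-up random data rather than the slide's exact numbers: when two columns of X are nearly collinear, the entries of $(X^T X)^{-1}$ and its condition number blow up, which is what makes b unstable.

```python
# Compare (X^T X)^{-1} for uncorrelated vs. nearly collinear columns.
import numpy as np

rng = np.random.default_rng(1)
n = 100
x1 = rng.normal(size=n)

for label, x2 in [("uncorrelated", rng.normal(size=n)),
                  ("highly correlated", x1 + 1e-2 * rng.normal(size=n))]:
    X = np.column_stack([x1, x2])
    X = (X - X.mean(axis=0)) / X.std(axis=0)   # autoscale for comparability
    XtX = X.T @ X / (n - 1)                    # correlation-like matrix
    print(label, "condition number:", np.linalg.cond(XtX))
    print(np.linalg.inv(XtX))                  # huge entries in the correlated case
```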

(10)

What if X’s are correlated

• If there is high correlation among columns in X:

• unstable solutions for b

• predictions are also uncertain

• What to do about it? Select uncorrelated columns from X.

• Other issues:

• X has (measurement) error; MLR assumes it doesn't.

• MLR cannot handle missing values.

PCR and PLS avoid these drawbacks.

(11)

What if X’s are correlated

• Geometrically speaking

High correlation between $x_1$ and $x_2$

(12)

Principal component regression

$\hat{Y} = TB$, $\quad T = XP$

and B can be calculated as $B = (T^T T)^{-1} T^T Y$

(13)

Principal component regression

Two blocks, X and Y

• Objective: model both X and Y and the relationship between X and Y

X is summarized by PC scores (t's) in matrix T: $T = XP$

• PC scores are used as independent variables in MLR:

$\hat{Y} = TB$, where $B = (T^T T)^{-1} T^T Y$

(14)

Principal component regression

• Building PCR model

Indirect modeling (no direct regression between x's and y's):

$T = XP$, $\quad \hat{Y} = TB$, where $B = (T^T T)^{-1} T^T Y$

• Advantages (a PCR sketch in code follows below):

• Columns in T are orthogonal

• Can handle missing values

• T has much less error than X

• Less need for variable selection
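A minimal PCR sketch in NumPy, assuming two-dimensional arrays X and Y and an SVD-based PCA for the loadings; the function name pcr_fit is illustrative, not from the slides.

```python
# PCR: T = X P (PCA scores), then regress Y on T.
import numpy as np

def pcr_fit(X, Y, A):
    """Return PCA loadings P (K x A) and regression coefficients B (A x M)."""
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    P = Vt[:A].T                              # loadings, K x A
    T = Xc @ P                                # scores,   N x A  (T = X P)
    B = np.linalg.solve(T.T @ T, T.T @ Yc)    # B = (T^T T)^{-1} T^T Y
    return P, B

# Prediction for new data X_new (same centering as training):
# Y_hat = (X_new - X.mean(axis=0)) @ P @ B + Y.mean(axis=0)
```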

(15)

Principal component regression

(16)

Principal component regression

• Using a PCR model: one can check the consistency of a new observation before predicting its y's

• Check SPE_new

• Check T²_new

• Projection to latent structures (PLS), a.k.a. partial least squares

• A better alternative to PCR

• Indirect modeling (inner model and outer model)

• Idea? Build a regression model between the scores of X and Y.

(17)

Projection to Latent Structures (PLS)

(18)

Projection to Latent Structures (PLS)

• PLS

Generalization of PCA to deal with the relationship X → Y

• Advantages over PCR:

• Has a model of Y space.

Can handle correlation in Y.

Assumes there is error both in X and in Y

(19)

Projection to Latent Structures (PLS)

(20)

Projection to Latent Structures (PLS)

Objective function for PCA: best explanation of the X-space

• Optimization formulation of PCA

• Objective function for PLS: has 3 parts

1. Best explanation of the X-space
2. Best explanation of the Y-space
3. Maximize the relationship between the X- and Y-space

PCA: the projection of X is an optimal approximation of X.

PLS: the projection of X both approximates X well and is predictive of Y.

(21)

Review of PCA formulation

• For PCA, the best explanation of the X-space means the greatest variance of $t_a$ (variance proportional to $t_a^T t_a$).

• How do we get the scores? $t_a = X_a p_a$ (a small sketch in code follows below)

• $\arg\max_{p_a} \left( t_a^T t_a \right) \quad \text{s.t.} \quad p_a^T p_a = 1.0$
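A small sketch of this score calculation, assuming mean-centered data and using the SVD (for the first component, $X_a$ is simply the centered X); it checks that $t_1 = X p_1$ and that its summed squares equal the leading singular value squared.

```python
# First PCA component via SVD: loading p1 maximizes t1^T t1 with p1^T p1 = 1.
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(50, 4))
X = X - X.mean(axis=0)             # mean-center

U, s, Vt = np.linalg.svd(X, full_matrices=False)
p1 = Vt[0]                         # first loading vector, unit length
t1 = X @ p1                        # first score vector, t1 = X p1
print(t1 @ t1, s[0]**2)            # equal: the maximized (unnormalized) variance
```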

(22)

Back to PLS

1. PLS scores explain X: $t_a = X_a w_a$ for the X-space, with $\max \left( t_a^T t_a \right)$ subject to $w_a^T w_a = 1.0$

2. PLS scores also explain Y: $u_a = Y_a c_a$ for the Y-space, with $\max \left( u_a^T u_a \right)$ subject to $c_a^T c_a = 1.0$

3. PLS maximizes the relationship between the X- and Y-space

• How?

(23)

PLS: maximize relationship

We have two scores: $t_a$ and $u_a$

• $t_a$: summary of the X-space

• $u_a$: summary of the Y-space

• The objective function of PLS maximizes the covariance, $\mathrm{Cov}(t_a, u_a)$:

$\mathrm{Cov}(t_a, u_a) = E\{ (t_a - \bar{t}_a)(u_a - \bar{u}_a) \} = \dfrac{t_a^T u_a}{N - 1}$

• This actually does three simultaneous things … (a one-component sketch in code follows below)
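A hedged sketch of one PLS component computed NIPALS-style, which is one standard way to obtain the score pair $(t_a, u_a)$ whose covariance the component maximizes; X and Y are assumed to be mean-centered two-dimensional arrays (and, for later components, deflated), and the function name is illustrative.

```python
# One NIPALS-style PLS component: returns weights and the score pair (t, u).
import numpy as np

def pls_one_component(X, Y, n_iter=100, tol=1e-10):
    u = Y[:, [0]]                          # start from one Y column
    for _ in range(n_iter):
        w = X.T @ u                        # X-weights
        w /= np.linalg.norm(w)             # enforce w^T w = 1
        t = X @ w                          # X-scores, t = X w
        c = Y.T @ t                        # Y-weights
        c /= np.linalg.norm(c)             # enforce c^T c = 1
        u_new = Y @ c                      # Y-scores, u = Y c
        if np.linalg.norm(u_new - u) < tol:
            u = u_new
            break
        u = u_new
    return w, t, c, u

# Cov(t, u) ~ (t.T @ u) / (N - 1) is what this component maximizes.
```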

(24)

PLS: maximize relationship


(25)

PLS: maximize relationship

Maximizing the covariance between $t_a$ and $u_a$ is actually maximizing

$t_a^T u_a = (X_a w_a)^T (Y_a c_a) = w_a^T X_a^T Y_a c_a$, subject to $w_a^T w_a = c_a^T c_a = 1.0$

which simultaneously explains the X-space, explains the Y-space, and maximizes the relationship between them.

(26)

PLS: geometric interpretation

For the matrices X and Y we have a K-dimensional and an M-dimensional space, respectively.

Each object is one point in the X-space and one point in the Y-space.

X and Y are two connected swarms of points in these two spaces.

Mean-centering and scaling: same as in PCA (a preprocessing sketch follows below).

Calculate the average of each variable; these averages are subtracted from X and Y. Then each variable is (usually) scaled to unit variance.
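A minimal sketch of this preprocessing step, assuming NumPy arrays; the helper name autoscale is illustrative.

```python
# Mean-center and scale each column to unit variance (autoscaling).
import numpy as np

def autoscale(M):
    """Subtract column means and divide by column standard deviations."""
    mean, std = M.mean(axis=0), M.std(axis=0, ddof=1)
    return (M - mean) / std, mean, std

# Apply the *same* mean and std to any new data before projecting it onto the model.
```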

(27)

PLS: geometric interpretation

The 1st PLS component is a line in the X-space and a line in the Y-space, through the average points, such that

1. The lines approximate the data well.

2. The projections ($t_1$ and $u_1$) are well correlated. (see next slide)

(28)

PLS: geometric interpretation

The projected coordinates in the two spaces ($u_1$ in Y and $t_1$ in X) are correlated through the inner relation

$u_{i1} = t_{i1} + h_i$  ($h_i$ is a residual)

shown as a $u_1$ vs. $t_1$ plot with slope = 1.0.

(29)

PLS: geometric interpretation

The 2nd PLS component is again a pair of lines in the X- and Y-spaces, through the averages.

The lines in the X-space are orthogonal; the lines in the Y-space are not orthogonal.

These lines improve the approximation and the correlation as much as possible.

(30)

PLS: geometric interpretation

[Plots of $u_1$ vs. $t_1$ and $u_2$ vs. $t_2$, each with slope = 1.0]

The 2nd projection coordinates ($u_2$ and $t_2$) are also correlated, but usually less strongly than the first.

(31)

PLS: geometric interpretation

The PLS components together form planes (or hyperplanes) in the X- and Y-spaces.

The variability around the X-plane is used to calculate a tolerance interval, SPE (DModX): points on the model plane have SPE ≈ 0, while points outside the usual range are flagged as inconsistent with the model.

(32)

PLS: geometric interpretation

For a new object: (1) apply the same preprocessing and project it onto the model plane to obtain its new scores, (2) check SPE_new and T²_new for consistency with the model, (3) if consistent, predict its y values (a sketch in code follows below).
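A hedged sketch of these three steps, assuming a fitted latent-variable model is already available as NumPy arrays: training means/scales, a score-projection matrix W_star (for PLS this is usually $W(P^T W)^{-1}$), X-loadings P, and score-to-Y coefficients B. All names are illustrative, not from the slides.

```python
# Hypothetical helper: score one new observation with an already-fitted model.
import numpy as np

def score_new_object(x_new, x_mean, x_std, W_star, P, B, y_mean):
    x = (x_new - x_mean) / x_std     # (1) same centering/scaling as training
    t = x @ W_star                   # (2) new scores on the model plane
    e = x - t @ P.T                  #     residual off the X-plane
    spe = float(e @ e)               #     SPE_new; compare to its tolerance limit
    y_hat = t @ B + y_mean           # (3) prediction, trusted only if (2) looks normal
    return t, spe, y_hat
```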

(33)

Projection to Latent Structures (PLS)

• Summary

K, M: number of X, Y variables

N: number of objects

A: number of PLS components

k(=1,2,…,K), m(=1,2,…,M): indices for X and Y variables

T, U: score matrices of X and Y

P: loading matrix

W: X-weight matrix

C: Y-weight matrix

[Diagram: X decomposed into scores T and loadings Pᵀ via weights Wᵀ; Y decomposed into scores U and weights Cᵀ]

(34)

Projection to Latent Structures (PLS)

• Summary

1. Preprocessing (mean-centering and scaling).

2. PLS projection of the data (X and Y) onto hyperplanes.

3. Scores t and u are the coordinates in the hyperplanes.

4. Loadings p and weights w and c define the directions of the hyperplanes.

Model equations (a usage sketch with scikit-learn follows below):

$X = TP^T + E$

$Y = UC^T + F$

$Y = TC^T + F'$

$U = T + H$

$Y = XB + E$
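For practical use, scikit-learn's PLSRegression (NIPALS-based) implements this kind of model; below is a usage sketch with made-up data, assuming scikit-learn is installed (n_components corresponds to A, the number of PLS components).

```python
# Fit a 2-component PLS model and extract scores, weights, loadings, predictions.
import numpy as np
from sklearn.cross_decomposition import PLSRegression

rng = np.random.default_rng(3)
X = rng.normal(size=(30, 6))                                   # N=30, K=6
Y = X[:, :2] @ rng.normal(size=(2, 2)) + 0.1 * rng.normal(size=(30, 2))  # M=2

pls = PLSRegression(n_components=2, scale=True)                # autoscaling built in
pls.fit(X, Y)

T, U = pls.transform(X, Y)                                     # score matrices T and U
Y_hat = pls.predict(X)                                         # predicted Y
print(pls.x_weights_.shape, pls.x_loadings_.shape)             # W and P
```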
