Development and Validity of the RTI […] - Revision of the Clinical Study Literature Classification Tool and Risk-of-Bias Assessment Tool

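Were appropriate statistical methods used to assess the main harms or adverse events?

J. Interpretation

Appropriately based on the evidence

Are the results believable even after taking the study's limitations into account? (Abstractor: this question is intended to gauge the overall quality of the study. Consider issues that limit the ability to interpret the study results.)

K. Presentation and reporting

Were completeness, clarity, and structure confirmed?

[…] was constructed.

Items were classified by […], following studies such as […].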

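A […] made up of […] staff members from […] was convened. It reviewed and advised on the […] and […] and shared knowledge of earlier tools. […] was conducted on […]; members advised on how items should be revised, interpreted, and measured.

[…] potential users took part in the […] of the […]. Working independently, they judged whether each item's instructions, question, and response options were easy to read and were interpreted as intended. On the basis of their feedback, […] questions were deleted.

[…] people took part in the […] of the […]. The raters looked at each question and judged whether it was essential, useful, or unnecessary. They repeated this four times, in relation to cohort studies, case-control studies, […], and cross-sectional studies.

[…] was used to check whether each question was needed; a question was considered essential when […]. However, it was difficult to single out questions for exclusion.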

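[…] raters assessed […] studies with […]. The […] studies consisted of […]; however, […] of the studies classified as […] had no comparison group. The raters received summary information on the systematic review's key outcomes (benefits and/or harms), important confounders, and analytic framework. […] consisted solely of multiple-choice questions; where a question did not apply, […] was used.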

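Mean agreement and its standard deviation were calculated within each study and across studies. The […] statistic is […] for […] raters. […] measures whether a test appears valid to the rater, whereas […] calls for more rigorous statistical methods than […], which relies on intuitive judgment.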

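[…] ranges from […] (disagreement) to […] (agreement) and, because it reflects the probability of agreement among multiple raters, is considered a suitable test of reliability. The […] of […] was calculated for […] raters and […] articles. […] was also calculated, but the results are not presented, because of the […] in which high agreement can yield a low kappa score. The mean […] score was low, at […], and the mean percent agreement between raters across all items was […].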
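The point that high agreement can coexist with a low kappa score can be illustrated with a minimal sketch. The ratings below are hypothetical and are not taken from the study; the sketch uses Cohen's kappa for two raters, which is an assumption made for simplicity and not necessarily the statistic the original authors computed. The function names are illustrative only.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Share of items on which the two raters gave the same rating."""
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters: (p_o - p_e) / (1 - p_e)."""
    n = len(rater_a)
    p_o = percent_agreement(rater_a, rater_b)
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    categories = set(counts_a) | set(counts_b)
    # Agreement expected by chance, from each rater's marginal proportions.
    p_e = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    if p_e == 1.0:
        # Both raters used a single identical category; kappa is undefined.
        return float("nan")
    return (p_o - p_e) / (1 - p_e)


# Hypothetical data (assumption): 100 studies rated "low" or "high" risk.
# Both raters say "low" for 90 studies and disagree on the remaining 10.
rater_a = ["low"] * 90 + ["low"] * 5 + ["high"] * 5
rater_b = ["low"] * 90 + ["high"] * 5 + ["low"] * 5

print("percent agreement:", percent_agreement(rater_a, rater_b))
print("Cohen's kappa:", round(cohen_kappa(rater_a, rater_b), 3))
```

With these made-up ratings the raters agree on 90% of studies, yet kappa comes out slightly below zero, because nearly all ratings fall in one category and chance agreement is therefore already about 0.905.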

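The mean time required was […] minutes, and the […] was […] minutes.

The number of questions was reduced over the course of the […] stages. The tool's […] was improved, and questions unsuited to […] were deleted.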

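Overall, inter-rater […] scores were low, because standardized assessment criteria had not been developed for each study. Because the aim was to gain experience in using […] across a wide range of studies, the assessment had to cover different topics rather than […] single systematic review topic, which limited the results. […] needs to establish the […] that can arise more readily in observational studies. In addition, inter-rater reliability within a single topic, or […], should be measured. Third, the empirical basis should be evaluated, for example by assessing the relationship between responses to specific […] and effect sizes […].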

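Appendix: Research team's opinions on the user survey and whether they were incorporated

Literature classification tool

Risk-of-bias assessment tool for non-randomized studies

Selection bias caused by inadequate selection of participants

Selection bias caused by inadequate identification and consideration of confounding variables

Performance bias caused by inadequate measurement of intervention (exposure)

Detection bias caused by inadequate blinding of outcome assessment

Attrition bias caused by inadequate handling of incomplete data

Reporting bias caused by selective outcome reporting

Appendix: Opinions of international experts and how they were addressed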

The only thing I recommend is that the authors expand on the domains to which the tool can be applied: Is it only helpful for clinical epidemiology type of study, or also for etiological studies?

A relevant explanation was added to the opening part of the tool description.

justification for the sample size of 39 non-randomised studies;

despite the considerable amount of work that has gone into this project, this sample is low; the generalisability of the results should be discussed in relation to the sample assessed in more detail (the authors do note this as a limitation, but this does not seem sufficient).

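In the present reliability and validity evaluation of the revised tool, an appropriate number of articles was derived through kappa-related sample size calculations and similar methods, and a larger number of articles was assessed to evaluate reliability, validity, and […].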

Examples of when NRS studies should be used could be stronger in the introduction.

“Some questions of interest cannot be answered by a review of randomized trials, and some interventions cannot be randomized or are extremely unlikely to be studied in randomized trials. For example, evidence of certain effects, such as long-term and rare outcomes, or outcomes that were not considered important when major randomized trials were conducted, cannot be adequately conducted in randomized trials. In these contexts, review authors may be justified in including non-randomized studies (NRS)”

measurements of intervention;

how would a reviewer assess this item for potential bias in their pooled estimates of effect.

We changed domain name from measurement of intervention/exposure to measurement of exposure

variety of terminology used and the potential use of nested designs, this article does not provide sufficient guidance for a systematic reviewer to implement this tool. This issue is especially confusing in Appendix one as the information is not presented consistently across domains by study design.

The following text was added to the “limit” part of the Discussion section:

Fourth, RoBANS is a tool to evaluate non-randomized trials, cohort studies, case-control studies, and before-after studies. When it is applied to the examination of different research designs, the characteristics of these research designs should be considered before deciding on the means of applying RoBANS.

Are studies to be measured by outcome or study? This is not clear in this report.

In principle, it is recommended to evaluate studies by outcome.

To clarify this issue, we have made the following corrections to the text:

“Similar to Cochrane ROB, ROBANS is outcome-based checklists. In particular, the domains of blinding of outcome assessments and incomplete outcome data can be treated as outcome-based evaluations.”

the use of language with regards to suitable inter-rater reliability should be revised, the results and conclusion could be deemed misleading, as upon further review the inter-rater reliability is not as impressive (especially when you consider the small sample) as is suggested in the abstract.

Corrections were made in the Results and Discussion sections.

There is no clear link to the forest plots in the text; it is uncertain what these plots are and how they contribute to this report without surmise.

An explanation of Figure 1 has been added to the body of the text.

The author suggests that the tool is relatively valid (abstract), but relative to what?

The term, “relatively,” has been deleted from the abstract, and the text of the discussion section has been modified appropriately.

It could be suggested that the results of this report are overstated in the discussion and the conclusions in contrast to the results found.

“RoBANS shows moderate reliability and acceptable feasibility, validity. The further refinement of this tool and larger validation studies are required.”

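Appendix: Summary of advisory responses on the tools

Literature classification tool

Appendix: Advisory comments on the risk-of-bias assessment tool for non-randomized studies

Appendix: List of primary studies used in the reliability and validity evaluation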

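Revision of the Clinical Study Literature Classification Tool and Risk-of-Bias Assessment Tool

Date of publication: […]

Publisher: President, Health Insurance Review and Assessment Service
Published by: Health Insurance Review and Assessment Service
Research institution: Hallym University Industry-Academic Cooperation Foundation
Principal investigator: Kim Soo Young

Address: Hyoryeong-ro […]-gil, Seocho-gu, Seoul
Contact: […]

Printed by: Hanmi Process

This report is the product of a commissioned research project carried out by the Health Insurance Review and Assessment Service. When publishing the contents of this report, you must state that they are the results of a commissioned research and development project carried out by the Health Insurance Review and Assessment Service.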
