\section*{Reporting of codes in the current literature}
Primary care database (PCD) studies make up a large component of total EMR research, and UK PCDs are among the most researched in the world. Figure \ref{figure1_articles_per_year} shows that research output from UK PCDs appears to be increasing at an exponential rate, while Figure \ref{figure2_PCD_map} shows that research using UK PCDs is conducted in universities, pharmaceutical companies and research hospitals around the world, and is not limited to the UK. Since UK PCDs are among the largest and most important resources for EMR-based research, it seems reasonable to expect reporting of code lists in UK PCD-based studies to be at least as comprehensive as in other EMR studies. To evaluate levels of transparency in the reporting of clinical code lists, we took a representative sample of UK PCD studies and assessed each study on its extent of reporting of the clinical codes used.
We took a sample of 450 papers from the original 1359 identified from a PubMed search. Of these, 392 (87\%) both had full text accessible to the University of Manchester library and were examples of primary PCD research. Only 35 (9\% of 392) studies published the entire set of clinical codes needed to reproduce the study (usually in an online appendix), while only an additional 47 (12\% of 392) stated explicitly that the clinical codes are available upon request (Table \ref{tab:table1_percentages}).
\section*{The need for transparency in clinical code usage}
We identify four main consequences of a lack of transparency in clinical code lists. First, if code lists are not made available or not published alongside the primary research using them, they represent an important part of a study methodology that is not subject to scrutiny or peer review. In the extreme case, there is no way of assessing the validity of the diagnosis definition used in a study, and clinical decisions could be based on invalid results derived from an incorrect patient base. This could happen despite rigorous downstream statistical analysis. Second, the effective replication of EMR studies depends on the availability of the clinical codes from the original study. If the full set of codes is not available, it is impossible to tell whether differences found in study replications are genuine or due to artifactual differences in code lists. Third, if code lists are unknown, comparisons between studies addressing the same clinical question are potentially invalidated. Condition definitions change over time, and GP coding practice may also change with respect to regulations and incentives \cite{Hippisley-Cox2006}. Different studies may also use different types of codes for a condition; some studies, for example, include medication and monitoring codes as part of their definition of a patient with diabetes (e.g. \cite{Mulnier2006}) while others do not (e.g. \cite{Kontopantelis2014}). Without access to code lists, it is difficult to know whether fair comparisons are being made between studies. Fourth, building code lists is a time-consuming process; having access to historical code lists would mean that new lists could be built incrementally and iteratively, saving much `reinvention of the wheel' while increasing consistency, and potentially accuracy, of definitions across studies.
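The third consequence can be made concrete with a minimal sketch. The Read-style codes and the two study definitions below are purely illustrative (not taken from any actual study): two "diabetes" definitions that differ only in whether medication codes are included will select different patient sets, so an overlap measure such as the Jaccard index gives a quick check of how comparable two published code lists really are.

```python
# Illustrative sketch: comparing two hypothetical "diabetes" code lists.
# Study A includes medication codes in its definition; study B uses
# diagnosis codes only. All codes here are made up for illustration.
study_a = {"C10..", "C100.", "C1000", "f41..", "f42.."}  # diagnosis + medication
study_b = {"C10..", "C100.", "C1000"}                    # diagnosis only

def jaccard(a: set, b: set) -> float:
    """Overlap between two code lists (1.0 means identical definitions)."""
    return len(a & b) / len(a | b)

print(f"Jaccard overlap: {jaccard(study_a, study_b):.2f}")   # 0.60
print(f"Codes unique to study A: {sorted(study_a - study_b)}")
```

A low overlap does not by itself mean either definition is wrong, but without both lists being published, this comparison cannot be made at all.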
\section*{Acknowledgments}
We thank Matt Ford for extensive technical support, and the Research team at CPRD for fruitful discussions during the development stage.
%\subsection*{Funding statement}
%This work is funded by the National Institute for Health Research (NIHR) School for Primary Care Research (SPCR).
%\subsection*{Disclaimer}
%This article presents independent research funded by the National Institute for Health Research (NIHR). The views expressed are those of the author(s) and not necessarily those of the NHS, the NIHR or the Department of Health.
%\section*{Author Contributions}
%Conceived, designed and built the website and software: DAS. Data collection: DAS, DR, EK, IO, RP, DA, EC. Data Analysis: DAS. Wrote the manuscript DAS. Edited the manuscript DAS, DR, EK, IO, RP, DA, EC.
\signature{David A Springate \\ Research Fellow \\ Institute of Population Health \\ University of Manchester}
\begin{document}
\begin{letter}{Institute of Population Health \\ University of Manchester \\ UK}
\opening{Dear Sirs,}
We would like the editors to consider our article entitled ``ClinicalCodes: An online clinical codes repository to improve the validity and reproducibility of research using electronic medical records'' for publication in PLoS ONE.
In this manuscript, we describe a new online database of clinical code lists (www.clinicalcodes.org) for use by researchers working with electronic medical records (EMRs). This resource will allow clinical researchers to better validate EMR studies, build on previous clinical code lists and compare condition definitions across studies. It will also assist health informaticians in replicating database studies, tracking changes in disease definitions or clinical coding practice through time, and sharing clinical code information across platforms and data sources as research objects.
Although accurate definitions of medical conditions are a prerequisite for valid EMR studies, and these definitions depend upon careful selection of clinical codes, the publication of clinical codes is rarely, if ever, a requirement for obtaining grants, validating protocols or publishing research. We evaluated the levels of transparency in the reporting of clinical code lists in a representative sample of UK primary care database studies. Of the 392 studies we examined, only 35 (9\%) published the entire set of clinical code lists needed to reproduce or validate the study. These were most often published in online appendices.
We identify four main consequences of a lack of transparency in clinical code lists:
\begin{enumerate}
\item Code lists are not subject to scrutiny or peer review
\item It is impossible to tell if differences found in study replications are genuine or due to artifactual differences in code lists
\item Comparisons between studies of the same clinical conditions are potentially invalidated
\item Lack of access to historical code lists leads to much wasted effort on the part of researchers
\end{enumerate}
The database described here will provide a centralised repository for EMR researchers to deposit their codes and this will lead to greater transparency, reproducibility and validity in this important area of research.
We believe this submission fits all the PLoS ONE criteria for database papers, namely utility, validity and availability. The resource will be of great use to the EMR community, and we expect the paper to be highly referenced and the ClinicalCodes database to become the de facto repository for clinical code lists across EMR research. The database is an effective repository for clinical code lists, and we are aware of no similar open repositories for clinical codes. The database is written entirely using open source software and is freely available for access, upload and download. In addition, we have developed open source software to access the database programmatically and to download research objects for integration with other systems.
We would like to recommend Irene Petersen from UCL as an Academic Editor.