Prediction of citation counts for clinical articles at two years using data available within three weeks of publication: retrospective cohort study
BMJ 2008; 336 doi: https://doi.org/10.1136/bmj.39482.526713.BE (Published 20 March 2008) Cite this as: BMJ 2008;336:655
All rapid responses
Here is another study that found a high correlation between early downloads and later citations (Brody et al 2006).
It will be important to validate all these metrics (including citations) against peer evaluation too (Harnad 2007).
Brody, T., Harnad, S. and Carr, L. (2006) Earlier Web Usage Statistics as Predictors of Later Citation Impact. Journal of the American Society for Information Science and Technology (JASIST) 57(8) pp. 1060-1072. http://eprints.ecs.soton.ac.uk/10713/
Harnad, S. (2007) Open Access Scientometrics and the UK Research Assessment Exercise. In Proceedings of 11th Annual Meeting of the International Society for Scientometrics and Informetrics 11(1), pp. 27-33, Madrid, Spain. Torres-Salinas, D. and Moed, H. F., Eds. http://eprints.ecs.soton.ac.uk/13804/
Competing interests: None declared
A recent paper presented an interesting model to predict citation counts for clinical articles 1. This topic is so important that we can predict that the paper itself will attract many citations. We first want to clarify some of the nomenclature of validation of prediction models, to avoid confusion in future reporting.
The authors randomly divided 1274 articles into a derivation data set of 757 articles for development of a prediction model and a validation data set of 504 articles for testing, after exclusion of outliers with >150 citations. This procedure is an example of a ‘split-sample’ approach. The authors, however, refer to it as ‘cross-validation’. Cross-validation would mean that we develop a model in the first part of the data and test it in the second part, and then repeat the procedure with development in the second part and testing in the first.
The authors report that explained variation (R²) decreased from 0.60 at development to 0.56 at validation, and refer to this decrease as ‘shrinkage’. Shrinkage is not an appropriate term for this decrease; a better label is ‘optimism’ 2 3. Optimism is the phenomenon that prediction models tend to perform more poorly in new data than in the data in which the model was developed; it occurs especially when many predictors are considered in relatively small data sets 4. Ironically, a need for ‘shrinkage’ is well illustrated in Fig 2, where we note that the residuals are generally positive for low predictions (which were often too low), and generally negative for high predictions (which were often too high) 1. Shrinkage should be applied to the regression coefficients for more reliable predictions 2 4 5 6.
How valid is this model to predict citations? First, the authors did not shrink regression coefficients, which implies that high predictions will be too high and low predictions too low for articles fulfilling the inclusion criteria.
Second, for a future article we cannot know beforehand whether the article is an outlier, i.e. one with more than 150 citations. Exclusion of outliers at validation is artificial and should not have been done; it has inflated the R² of the model. As always with prediction models, future validation is required and may reveal disappointing performance.
Ewout W Steyerberg, Hester Lingsma
References
1. Lokker C, McKibbon KA, McKinlay RJ, Wilczynski NL, Haynes RB. Prediction of citation counts for clinical articles at two years using data available within three weeks of publication: retrospective cohort study. BMJ 2008;336(7645):655-7.
2. Harrell FE, Jr., Lee KL, Mark DB. Multivariable prognostic models: issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors. Stat Med 1996;15(4):361-87.
3. Steyerberg EW, Harrell FE, Jr., Borsboom GJ, Eijkemans MJ, Vergouwe Y, Habbema JD. Internal validation of predictive models: efficiency of some procedures for logistic regression analysis. J Clin Epidemiol 2001;54(8):774-81.
4. Steyerberg EW, Eijkemans MJ, Harrell FE, Jr., Habbema JD. Prognostic modeling with logistic regression analysis: in search of a sensible strategy in small data sets. Med Decis Making 2001;21(1):45-56.
5. Copas JB. Regression, prediction and shrinkage. J R Stat Soc, Ser B 1983;45(3):311-354.
6. van Houwelingen JC, Le Cessie S. Predictive value of statistical models. Stat Med 1990;9(11):1303-25.
Competing interests: None declared
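The distinction drawn in the response above between a split-sample check and cross-validation, and the reading of the drop in R² as optimism, can be sketched in a few lines. This is only an illustration under stated assumptions: the data are simulated, the model has a single predictor, and the helper names (`fit_line`, `r_squared`, `split_sample`, `two_fold_cross_validation`) are hypothetical and are not taken from the original study.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x with a single predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def r_squared(model, xs, ys):
    """Explained variation of an already fitted model on the given data."""
    intercept, slope = model
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


def split_sample(xs, ys, n_dev):
    """Develop on the first n_dev rows, then test once on the remaining rows."""
    model = fit_line(xs[:n_dev], ys[:n_dev])
    apparent = r_squared(model, xs[:n_dev], ys[:n_dev])
    validated = r_squared(model, xs[n_dev:], ys[n_dev:])
    return apparent, validated


def two_fold_cross_validation(xs, ys):
    """Develop on each half in turn and test on the other half."""
    half = len(xs) // 2
    first = (xs[:half], ys[:half])
    second = (xs[half:], ys[half:])
    results = []
    for (dev_x, dev_y), (test_x, test_y) in ((first, second), (second, first)):
        model = fit_line(dev_x, dev_y)
        results.append(r_squared(model, test_x, test_y))
    return results


# Simulated data (an assumption for illustration, not the study's articles).
# The rows are generated in random order, so no extra shuffling is needed.
random.seed(0)
xs = [random.gauss(0, 1) for _ in range(200)]
ys = [2 * x + random.gauss(0, 1) for x in xs]

# Roughly 60/40 split, one development set and one test set.
apparent, validated = split_sample(xs, ys, n_dev=120)
print("split-sample R2 (development, validation):", round(apparent, 3), round(validated, 3))
print("estimated optimism:", round(apparent - validated, 3))
print("two-fold cross-validated R2:", [round(r, 3) for r in two_fold_cross_validation(xs, ys)])
```

In any single random split the estimated optimism can be small or even negative; the term refers to the tendency across repeated samples, which is one reason repeated procedures are often preferred to a single split.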
Prediction of citation counts: a comparison of results from alternative statistical models.
We would like to thank those who responded to our article “Prediction of citation counts for clinical articles at two years using data available within three weeks of publication: retrospective cohort study” 1 and to provide some further details.
1. Steyerberg & Lingsma 2 suggest that we misused the terms cross-validation and shrinkage. We stand behind our use of these terms as being consistent with the definitions given, for example, in Everitt’s The Cambridge Dictionary of Statistics (2002) 3:
Cross-validation: The division of data into two approximately equal sized subsets, one of which is used to estimate the parameters in some model of interest, and the second is used to assess whether the model with these parameter values fits adequately. (p 102)
Shrinkage: The phenomenon that generally occurs when an equation from, say, a multiple regression, is applied to a new data set, in which the model predicts much less well than in the original sample. In particular the value of the multiple regression coefficient becomes less i.e. it ‘shrinks’. (p 344)
2. Steyerberg & Lingsma 2 also note that ‘the residuals are generally positive for low predictions (which were often too low), and generally negative for high predictions (which were often too high)’. A Lowess smoothed curve of the expected residuals that we have added to the published residual plot indicates almost equal probabilities of over- or under-prediction at lower citation counts (see Fig 1); above 50 citations, there is a tendency to overpredict citations, but there are very few observations at this level. Also, the size of the residuals is relatively small (-5 or -10) given the large number of predicted citations (70-100) for these papers.
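One common way to quantify the residual pattern under discussion is the calibration slope: the slope of a least-squares line of observed on predicted values. A slope below 1 means predictions are too extreme (low ones too low, high ones too high). The sketch below is a simplified illustration, not the method of either group of authors. The numbers are hypothetical, the function names are assumptions, and pulling predictions towards their mean is used here as a stand-in for shrinking regression coefficients. The slope is only informative when estimated on data not used to fit the model (or by resampling), since in the development data of a least-squares model it equals 1 by construction.

```python
def calibration_slope(predicted, observed):
    """Slope of the least-squares line of observed on predicted values."""
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    spp = sum((p - mean_p) ** 2 for p in predicted)
    spo = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    return spo / spp


def shrink_towards_mean(predicted, factor):
    """Pull every prediction towards the mean prediction by a common factor."""
    mean_p = sum(predicted) / len(predicted)
    return [mean_p + factor * (p - mean_p) for p in predicted]


# Hypothetical validation data: residuals (observed - predicted) are positive
# for low predictions and negative for high predictions.
predicted = [5, 10, 20, 40, 80]
observed = [8, 11, 19, 34, 63]

slope = calibration_slope(predicted, observed)
shrunk = shrink_towards_mean(predicted, slope)

print("calibration slope:", round(slope, 3))
print("shrunken predictions:", [round(p, 1) for p in shrunk])
print("slope after shrinkage:", round(calibration_slope(shrunk, observed), 3))
```

Shrinking only changes the spread of the predictions. Any difference between the mean prediction and the mean observed count would need a separate intercept correction.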
3. Based on various regression diagnostics, the outlier articles that we did remove were repeatedly showing undue influence on the analysis. We understand that to truly get a model predicting all citation counts they should have been left in. In response to other comments from readers of the article [P. Davies, personal communication], we have now performed a negative binomial regression (NBREG) which better models count data when the variance is greater than the mean, as is the case with our dataset. The results from this analysis (Table 1) indicate that the majority of the variables showed consistent results in terms of their coefficients and statistical significance, with the exception being the proportion of articles abstracted in Evidence Based journals, a variable which becomes non-significant in the NBREG. The discrepancy in the proportion of articles abstracted is a result of collinearity with the variable ‘number of databases indexing the journal’. We performed both analyses including or excluding the indexing database variable from the regression model. The “proportion abstracted” variable becomes significant in the NBREG once the number of databases indexed is removed. Overall, the NBREG results are fairly consistent with those from the OLS.
Our original model accounted for 60% of the variation in citation counts. After comparing the two sets of results, we feel that the conclusions from our original linear model are quite robust, but that the NBREG model provides a better description of the data because it includes the outlying data points. Fig 2 plots the residuals from the NBREG model, which show that it predicts accurately up to about 200 citations.
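The condition mentioned above, a variance greater than the mean, can be checked with a quick moment calculation before choosing between a Poisson-type and a negative binomial model. The sketch below is an assumption-laden illustration: the counts are made up, the function name is hypothetical, and the dispersion estimate uses the method of moments under the common parameterisation Var(Y) = mu + alpha * mu^2, which is a crude check and not the maximum likelihood fit that negative binomial regression software performs.

```python
def dispersion_summary(counts):
    """Mean, sample variance, variance-to-mean ratio and a moment estimate of alpha."""
    n = len(counts)
    mean = sum(counts) / n
    variance = sum((c - mean) ** 2 for c in counts) / (n - 1)
    ratio = variance / mean
    # Moment estimate under Var(Y) = mu + alpha * mu**2; floored at 0,
    # which corresponds to no overdispersion (the Poisson case).
    alpha = max(0.0, (variance - mean) / mean ** 2)
    return mean, variance, ratio, alpha


# Hypothetical two-year citation counts with a long right tail.
counts = [0, 2, 3, 5, 8, 12, 20, 30, 60, 160]

mean, variance, ratio, alpha = dispersion_summary(counts)
print("mean:", round(mean, 2))
print("variance:", round(variance, 2))
print("variance / mean:", round(ratio, 2))
print("moment estimate of alpha:", round(alpha, 3))
```

A variance-to-mean ratio far above 1, as in this made-up sample, is the situation in which a negative binomial model is usually preferred for count data.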
1. Lokker C, McKibbon KA, McKinlay RJ, Wilczynski NL, Haynes RB. Prediction of citation counts for clinical articles at two years using data available within three weeks of publication: retrospective cohort study. BMJ 2008; 336:655-7.
2. Steyerberg EW, Lingsma HF. Validating prediction models. BMJ 2008; 336:789. http://www.bmj.com/cgi/eletters/336/7645/655#192707
3. Everitt BS. The Cambridge Dictionary of Statistics. Cambridge, Cambridge University Press, 2006
Competing interests: The authors are employees of McMaster University. McMaster University owns the intellectual property for some of the processes described in the study including the McMaster premium literature service and the McMaster online rating of evidence (MORE), which are used to appraise and select articles for ACP Journal Club, Evidence-Based Medicine, Evidence-Based Nursing, bmjupdates+, BMJ Clinical Evidence, Medscape Best Evidence Alerts, Physicians Information and Education Resource, and Harrison’s Practice. Coauthors CL, RBH, KAM and NLW are or have been employed in part by contracts between McMaster University and the publishers of these information services.