### INTRODUCTION

*Archives of Plastic Surgery* (*APS*) adheres to the guidelines and best practices of the International Committee of Medical Journal Editors (ICMJE). The ICMJE recommendations for statistics state that authors should describe statistical analyses with enough detail to enable a reader to verify the reported results, and that authors need to provide appropriate indicators of measurement error or uncertainty, such as confidence intervals, beyond the P-value [2,3]. Furthermore, they recommend specifying the statistical software program(s) and versions used.

In this context, we examined the use of statistics in the articles published in *APS* from 2012 to 2017.

*APS* is the official journal of the Korean Society of Plastic and Reconstructive Surgeons and is published six times per year. Since 2012, it has continued the *Journal of the Korean Society of Plastic and Reconstructive Surgeons*, which was launched in 1974. This review article provides an overview of recent trends in the statistical methodology used in *APS*.

### METHODS

The articles published in *APS* from 2012 to 2017 were reviewed. Case reports, ideas and innovations, review articles, and letters were excluded. Of the remaining articles, 230 (59.3%) used statistical methods to analyze data and to report results. We classified them according to the types of statistical methods and software used, and checked whether there were errors in the description of statistical methods and results. We counted the number of statistical methods applied; when multiple statistical analyses were used in a study, each method was counted separately. The Cochran-Armitage trend test was used to assess the presence of linear trends in the percentages of statistical methods and statistical software used in the published articles by year from 2012 to 2017. R version 3.4.2 (R Foundation for Statistical Computing, Vienna, Austria) was used to perform the statistical tests, and P-values <0.05 were considered to indicate statistical significance.
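As an illustration of the trend test described here, the following is a minimal Python sketch of the two-sided Cochran-Armitage test for a linear trend in proportions. The analyses in this article were performed in R; this sketch and its example counts are ours for demonstration, not the article's data or code.

```python
import math

def cochran_armitage_trend(events, totals, scores=None):
    """Two-sided Cochran-Armitage test for a linear trend in proportions.

    events[i] -- count with the attribute of interest in ordered group i
                 (e.g., articles using statistics in year i)
    totals[i] -- number of observations in group i
    scores    -- ordered group scores (defaults to 0, 1, 2, ...)
    """
    if scores is None:
        scores = list(range(len(totals)))
    n_total = sum(totals)
    p_bar = sum(events) / n_total
    # Score-weighted deviations of observed counts from expected counts
    num = sum(w * (r - n * p_bar) for w, r, n in zip(scores, events, totals))
    s1 = sum(n * w for n, w in zip(totals, scores))
    s2 = sum(n * w * w for n, w in zip(totals, scores))
    var = p_bar * (1 - p_bar) * (s2 - s1 * s1 / n_total)
    z = num / math.sqrt(var)
    # Two-sided P-value from the standard normal distribution
    p = 2.0 * (1.0 - 0.5 * (1.0 + math.erf(abs(z) / math.sqrt(2.0))))
    return z, p
```

For example, `cochran_armitage_trend([20, 22, 25, 28, 30, 33], [40, 40, 40, 40, 40, 40])` tests six hypothetical yearly counts for a linear trend in the proportion of articles using statistics.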

### Statistical methods according to the objective of the analysis

Other statistical methods used in the articles published in *APS* include the normality test, power analysis, multivariate analysis, and reliability analysis using measures such as the intraclass correlation coefficient, Bland-Altman plots, the Cronbach alpha, and the kappa statistic.
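Of the reliability measures listed above, the kappa statistic is the simplest to compute directly; the following is a minimal Python sketch of Cohen's kappa for two raters (an illustration with a made-up agreement table, not data from the article):

```python
def cohens_kappa(table):
    """Cohen's kappa for inter-rater agreement from a square contingency table.

    table[i][j] -- number of subjects rated category i by rater A
                   and category j by rater B.
    """
    k = len(table)
    n = sum(sum(row) for row in table)
    # Observed proportion of agreement (diagonal of the table)
    p_o = sum(table[i][i] for i in range(k)) / n
    # Agreement expected by chance, from the marginal proportions
    row = [sum(table[i][j] for j in range(k)) / n for i in range(k)]
    col = [sum(table[i][j] for i in range(k)) / n for j in range(k)]
    p_e = sum(r * c for r, c in zip(row, col))
    return (p_o - p_e) / (1 - p_e)
```

For instance, `cohens_kappa([[20, 5], [10, 15]])` measures chance-corrected agreement between two raters classifying 50 subjects into two categories.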

### Parametric or nonparametric methods

### Errors in reporting statistical methods and results

We also assessed errors in reporting statistical methods and results in the articles published in *APS*. Errors in presenting P-values were observed, such as writing “P=0.00” or “P=1.00” instead of indicating that the P-value was very small or very large (e.g., P<0.001 or P>0.999), and insufficient descriptions of the P-value, such as mentioning only the significance of the results without an exact P-value. Errors in describing the statistical methods were evaluated in terms of whether the applied statistical methods were described in the Methods section and whether the description of the applied statistical methods was complete and correct.

### RESULTS

### Frequency and types of statistical methods

Of the original articles published in *APS* between 2012 and 2017, 230 (59.3%) used one or more statistical methods. Fig. 1 shows a statistically significant increase in the number of articles that used statistical methods over the 6 years (P for trend=0.023). In 2012 and 2013, the percentage of articles using statistics was around 50%; in 2017, 64.7% of the articles published in *APS* used statistical methods. The number of statistical methods used per article in *APS* was 1.87±1.06 (mean±standard deviation). Almost half of the articles using statistics (47.1%) employed one method (Table 2). One article used six statistical methods.

The statistical methods applied in the articles published in *APS* were also tallied by year. There were 261 applications of statistical methods for continuous or ordinal outcomes and 139 applications for categorical outcomes. Statistical methods for comparisons of independent samples were the most commonly used. Among the comparison methods, the Pearson chi-square test (17.4%) and the Fisher exact test (11.3%) for categorical outcomes, and the Mann-Whitney U test (14.4%) and the independent t-test (13.7%) for continuous or ordinal outcomes, were the most frequently used. The Wilcoxon signed-rank test, paired t-test, Kruskal-Wallis test, and analysis of variance (ANOVA) were also widely used, each appearing in more than 7% of the published articles using statistics. Within the category of regression analysis, logistic regression was used almost twice as often as linear regression. More complicated methods, such as repeated-measures ANOVA or linear mixed models, were applied in very few articles.
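The correspondence above between outcome type, study design, and comparison test can be summarized as a small lookup. This is a deliberately simplified illustration restricted to the tests named in this review, not an algorithm from the article:

```python
def suggest_test(outcome, paired=False, groups=2, parametric=True):
    """Suggest a comparison test for a simple study design.

    A simplified lookup covering only the two-group and multi-group
    comparisons discussed in this review.
    """
    if outcome == "categorical":
        # The review reports the Pearson chi-square and Fisher exact tests
        # as the most common choices for categorical outcomes.
        return "Pearson chi-square or Fisher exact test"
    if groups > 2:
        return "ANOVA" if parametric else "Kruskal-Wallis test"
    if paired:
        return "paired t-test" if parametric else "Wilcoxon signed-rank test"
    return "independent t-test" if parametric else "Mann-Whitney U test"
```

For example, `suggest_test("continuous", paired=True, parametric=False)` returns the Wilcoxon signed-rank test, matching the pairing of parametric and nonparametric alternatives described above.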

### Errors in reporting statistical methods and results

### Statistical software packages

Of the 230 articles published in *APS* that used statistical methods, 165 (71.7%) provided details about the statistical software programs used for the analyses; 65 articles did not provide any such information. The percentage of articles presenting information about the statistical software used increased by more than 10 percentage points, from 71.9% in 2012 to 84.8% in 2017, although a statistically significant increasing trend was not observed (P for trend=0.597) (Fig. 2).

### DISCUSSION

This study reviewed the articles published in *APS* from 2012 to 2017 with respect to the use and types of statistical methods and statistical software packages. The results showed an increasing trend in both the application of statistical methods and the use of statistical software packages.

Altman reviewed the use of statistics in the medical literature in *Statistics in Medicine*. He found a considerable increase in the use of statistics and detected much greater use of complex statistical methodology in medical research. Review articles regarding the use of statistics in medical journals [6-9] reflect Altman’s findings. Altman [10] also said, as a final comment, “Reviewing medical papers is difficult, time-consuming, occasionally frustrating, and educational. Many journals are desperate for expert statistical help.”

*APS* invited a statistical editor to join the editorial team in 2012, and started having statistical reviewers assess the submitted articles to improve the quality of statistical applications.

In the articles published in *APS*, there were nonetheless some statistical errors, including in the presentation of P-values and in the description of the statistical methods and/or statistical software used. Some authors stated whether the results were statistically significant without providing exact P-values, especially for non-significant results, which were frequently presented as “P=NS.” Moreover, some authors did not report P-values anywhere in the article, even for significant results, stating only whether the results were statistically significant. Exact P-values are useful information for interpreting the statistical results of hypothesis testing: a very small P-value indicates that the observed data are highly incompatible with the null hypothesis [11-13].

Some software packages output results with the P-value listed as 0.000 or 1.000, and researchers often copy and paste the P-value into the paper as is; however, such values should be presented as “P<0.001” or “P>0.999.” “P=0.000” would mean that there is absolutely zero chance of obtaining the observed results (or more extreme results) if the null hypothesis is true. However, there is always some chance of such an outcome, and we cannot definitively say that the probability is either 0 or 1.

Some authors reported P-values without details regarding the data (e.g., summary estimates such as the mean±standard deviation, number [%], or odds ratio). The P-value says nothing about the magnitude or the importance of an observed effect [11,12]. For example, a difference in the visual analogue scale for pain assessment before and after surgery of 0.1 with a P-value of 0.2 would be interpreted as a non-significant difference, while a difference of 0.01 with a P-value of 0.003 would be presented as significant. As argued by Wasserstein and Lazar [13], statistical significance is not equivalent to scientific, human, or economic significance.
Recently, some statements about the misuse of P-values were announced by a statistical society [13] and presented in a major medical journal [14]. To provide a broad and appropriate interpretation of the results of research, authors should report not only P-values with summary estimates, but also uncertainty measures such as the 95% confidence interval and/or standard error of estimates.
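The P-value reporting conventions discussed above (never “P=0.000” or “P=1.000”; report exact values otherwise) can be applied mechanically when preparing a manuscript. A small Python helper as an illustration (the function name and three-decimal format are our choices, not a rule from the article):

```python
def format_p(p):
    """Format a P-value for reporting.

    Very small or very large values are reported as bounds,
    so the output is never 'P=0.000' or 'P=1.000'.
    """
    if p < 0.001:
        return "P<0.001"
    if p > 0.999:
        return "P>0.999"
    return f"P={p:.3f}"   # otherwise, report the exact value
```

For example, `format_p(0.0234)` yields `P=0.023`, while a software printout of `0.000` becomes `P<0.001`.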

The instructions for authors of *APS* state that “methods of statistical analysis and criteria for statistical significance should be described” in the Methods section. Not only the names of the statistical analyses, but also the objectives of using those statistical methods should be described in detail in the Methods section.

Several other types of analysis were also used in *APS*. Reliability analyses for evaluating internal consistency, test-retest repeatability, or inter-rater agreement are performed to assess reproducibility or repeatability among techniques/modalities or human readers. Power analysis is needed when planning a prospective study to achieve an adequate number of subjects. One may also want to perform a power analysis if non-significant results are obtained due to a small sample size.
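For the power analysis mentioned above, a common closed-form approximation gives the sample size per group for comparing two independent means. A minimal Python sketch under the usual normal-approximation assumptions (the function name and default values are ours, for illustration only):

```python
import math
from statistics import NormalDist

def n_per_group(delta, sd, alpha=0.05, power=0.80):
    """Sample size per group to detect a mean difference `delta` between two
    independent groups with common standard deviation `sd`
    (two-sided test, normal approximation)."""
    z = NormalDist().inv_cdf
    z_alpha = z(1 - alpha / 2)   # e.g., 1.96 for alpha = 0.05
    z_beta = z(power)            # e.g., 0.84 for 80% power
    return math.ceil(2 * ((z_alpha + z_beta) * sd / delta) ** 2)
```

For example, `n_per_group(0.5, 1.0)` gives 63 subjects per group to detect a standardized difference of 0.5 with 80% power at a two-sided alpha of 0.05.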

In conclusion, the use of statistical methods in *APS* has increased over the last 6 years. Although there is room for improvement, researchers have been paying more attention to the proper use of statistics in recent years. These positive trends in *APS* are expected to continue in the future.