When an interviewer forgets to ask a respondent a particular question the nonresponse is called missing data?


Volume 18, Issue 2, February 2022, Pages 2308-2316

https://doi.org/10.1016/j.sapharm.2021.03.009

There are two types of people: 1) Those who can extrapolate from missing data.
- A random T-Shirt I saw.

Missing data occurs when an observation has no value assigned to it. In any data set, missing data is present wherever an input for an item has not been entered or generated. In surveys, this means that a respondent's answer is not available for further analysis.

There are multiple reasons why surveys can have missing data. For example, respondents may have skipped questions, data encoding caused variables to be counted as null or missing, the internet may have cut out during data gathering with electronic devices, a page of printed information may be missing, or a response item is deemed invalid.

Whether the missingness is intended or unintended, classifications have been developed to describe its type. Classifying missing responses informs decisions on how to handle missing data and, when reporting, how to tell readers what was done to mitigate or minimise missing values.

A recent review into missing data in pharmacy literature highlighted that a low proportion of studies reported on how missing data was handled.1 A lack of reporting can lead to bias in the interpretation of findings and validity of the research. The aim of this paper is to introduce the concept of missing data, how missing data is categorized as well as introduce common techniques to account for and report on missing data.

Before a decision can be made about what to do with the missing data, the type of missingness needs to be characterised. Consideration of missing data requires both subjective and objective analyses. Missing data may be classified according to the degree of randomness into three categories: Missing at Random (MAR); Missing Completely at Random (MCAR), or ignorable missingness; and Missing Not at Random (MNAR), also known as non-ignorable missingness.2,3

MCAR is when a missing value is not related to any other value in the data set.4,5 Conceptually, data that are MCAR are not usually attributed to a question in the survey or other phenomenon, whether observable or unobservable. Assume for example, a question being asked relates to income and is represented by the letter X1, while another question relates to occupation and is represented by the letter X2. In MCAR, the reason for X1 (income) having a missing response is not because of X1 (income) or X2 (occupation) i.e., neither the survey question, nor another confounder is the reason for the missing value. When MCAR is suspected, Little's Test of Missingness can be used to determine whether the missing values meet the specification of MCAR.6 A significant p-value result indicates that we reject the null hypothesis and assume that a pattern exists to the missing data (not MCAR). Little's Test of Missingness is available in most statistical software packages, either as a direct test or via a macro.
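As a rough illustration of what checking for a pattern in the missingness can look like, the sketch below compares how often a second variable takes a given value among respondents whose X1 is missing versus observed. It is a simple two-proportion z-test and is not Little's Test of Missingness. The function name, the data layout (None marking a missing response) and the "high-paying occupation" flag are assumptions made for illustration only.

```python
import math


def missingness_association(x1, x2_flag):
    """Two-proportion z-test: is x2_flag more common where x1 is missing?

    x1      -- list of responses, with None marking a missing value
    x2_flag -- list of booleans (hypothetical, e.g. high-paying occupation)
    Returns (z, two_sided_p). This is NOT Little's test, only a simple check.
    """
    miss = [flag for value, flag in zip(x1, x2_flag) if value is None]
    obs = [flag for value, flag in zip(x1, x2_flag) if value is not None]
    if not miss or not obs:
        raise ValueError("need both missing and observed responses")

    p_miss = sum(miss) / len(miss)   # share flagged among missing X1
    p_obs = sum(obs) / len(obs)      # share flagged among observed X1
    pooled = (sum(miss) + sum(obs)) / (len(miss) + len(obs))
    se = math.sqrt(pooled * (1 - pooled) * (1 / len(miss) + 1 / len(obs)))
    if se == 0:                      # flag is constant, nothing to compare
        return 0.0, 1.0

    z = (p_miss - p_obs) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return z, p_value
```

A small p-value here would suggest that the missingness of X1 is related to X2, which argues against MCAR for that pair of variables. A large p-value does not establish MCAR, since only one variable has been examined.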

Data that are MAR are missing because of another observable factor, such as an underlying or confounding variable that leads respondents not to answer a question. Certain groups may not respond to a question for an underlying reason. For instance, individuals with high paying jobs may not be inclined to answer questions that relate to finance. This holds both theoretically and conceptually, as research indicates that higher income earners are more likely to be non-responders to income questions.7 Consider the example from MCAR above, where X1 is income and X2 is occupation. The reason why X1 (income) may not be reported is based on X2 (occupation), where those with higher paying occupations are less inclined to provide a response.8 Thus, in the case of MAR, the reason for X1 having a missing response is based on X2, another variable.

Finally, MNAR data, or data that contain non-ignorable missingness, are data that do not meet the criteria of either MCAR or MAR. Unlike MCAR, for which an objective statistical test is available, MNAR requires subjective analysis to ascertain. In MAR, there may be a correlation between an observable phenomenon and why data are missing, but not a direct cause. Data that are MNAR, on the other hand, can be attributed to an unobservable factor that directly affects the reason that the data values are missing. This can be the question itself being the cause of the missing response, or underlying assumptions.5 As another example, in a survey of overall health, assume X1 is a depression-related question and X2 is gender. X1 (depression) can have a missing response based on X2 (gender), where men are less likely to talk about depression. This case would be MAR. On the other hand, if it is the level of depression, X1, that is causing the person to provide a null response, then the missingness is MNAR. This is where the cause of the missingness is the phenomenon that is being evaluated by the item itself, which in this case is X1.

To summarise the three categories, assume X1 is the variable with missing responses and X2 is another variable:

• MCAR = Neither X1 nor X2 can explain the missingness. Mathematically, from Little's Test, “No pattern exists.”
• MAR = Missingness of X1 is based on X2, where X2 is another variable in the dataset.
• MNAR = Missingness of X1 is based on X1 itself or another phenomenon that is rarely observed. It cannot be attributed to another observable dataset variable.
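The three categories can be made concrete with a small simulation. The sketch below generates a hypothetical income variable (X1) and a high-paying occupation flag (X2), then deletes income values under each mechanism. All probabilities, distributions and function names are assumptions chosen for illustration and are not taken from the paper.

```python
import random


def simulate_missingness(n=10000, seed=0):
    """Return (rows, mcar, mar, mnar); None marks a missing income."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        high_paying = rng.random() < 0.3              # X2: occupation flag
        income = rng.gauss(90 if high_paying else 50, 10)  # X1: income
        rows.append((income, high_paying))

    def mask(prob_missing):
        # Delete each income with a probability given by prob_missing.
        return [None if rng.random() < prob_missing(inc, hp) else inc
                for inc, hp in rows]

    mcar = mask(lambda inc, hp: 0.2)                     # depends on nothing
    mar = mask(lambda inc, hp: 0.5 if hp else 0.1)       # depends on X2 only
    mnar = mask(lambda inc, hp: 0.6 if inc > 70 else 0.1)  # depends on X1 itself
    return rows, mcar, mar, mnar


def missing_rate(values):
    """Proportion of entries that are missing (None)."""
    return sum(v is None for v in values) / len(values)


def observed_mean(values):
    """Mean of the non-missing entries."""
    seen = [v for v in values if v is not None]
    return sum(seen) / len(seen)
```

Under these assumptions, the observed mean income stays close to the true mean for MCAR, while for MNAR it is pulled downward because high incomes are the ones most often deleted. In the MAR case the missing rate differs between occupation groups, which is the kind of pattern that an observed variable can explain.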

Ideally, consideration of how to avoid missing data should be part of the initial survey design, sampling strategy, as well as the data analysis plan. Estimation of the proportion of missing data may be inferred from literature as well as pilot studies. The estimated proportion of missing data obtained allows for improved survey sample calculation.
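One common way to use an estimated proportion of missing data in a sample calculation is to inflate the required number of complete responses by that proportion. The sketch below assumes this simple inflation rule, which is a general convention rather than a method stated in this paper, and the function and parameter names are hypothetical.

```python
import math


def inflate_sample_size(required_complete, expected_missing):
    """Number of respondents to recruit so that, after an expected
    proportion of missing responses, enough complete responses remain.

    required_complete -- complete responses needed for the analysis
    expected_missing  -- anticipated proportion missing, in [0, 1)
    """
    if not 0 <= expected_missing < 1:
        raise ValueError("expected_missing must be in [0, 1)")
    return math.ceil(required_complete / (1 - expected_missing))
```

For example, if 385 complete responses are needed and 15% are expected to be missing, inflate_sample_size(385, 0.15) gives 453 respondents.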

If participants forget to answer a question or refuse to answer a question, then that information will not be collected. Missing by design is when

Data can be missing from two levels, either the variable (item) level or the case (individual) level.18 The item level non-response is where the data for a particular item is missing for a very high proportion of participants. For example, most respondents may have answered the whole survey, except that many have missed all the items regarding income, particularly if administered in a cohort of high wealth individuals. A non-response on the case level is where information pertaining to a
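The two levels can be summarised separately by computing a percentage of missing values for each item and for each case. The sketch below assumes survey responses stored as a list of dictionaries, with None or an absent key marking a missing response; the function name and data layout are illustrative assumptions.

```python
def missingness_summary(records, items):
    """Percentage of missing values per item and per case.

    records -- list of dicts, one per respondent (None or absent = missing)
    items   -- list of item names expected in every record
    Returns (item_pct, case_pct): a dict keyed by item, and a list
    aligned with records.
    """
    if not records or not items:
        raise ValueError("records and items must be non-empty")

    # Item level: share of respondents missing each item.
    item_pct = {
        item: 100 * sum(r.get(item) is None for r in records) / len(records)
        for item in items
    }
    # Case level: share of items missing for each respondent.
    case_pct = [
        100 * sum(r.get(item) is None for item in items) / len(items)
        for r in records
    ]
    return item_pct, case_pct
```

A high value in item_pct points to item level non-response, while a high value in case_pct flags an individual who left much of the survey unanswered.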

Missingness in trials and longitudinal studies also deserves mention. During the data gathering phase of these studies, loss to follow-up can lead to missing entries. In clinical trials, patients may be withdrawn because of side effects or because alternative treatments are provided. At other times, patients drop out with no indication of the cause, which can also lead to missing values. The prevention and handling of missing values in clinical trials have been discussed at length.34

With multiple approaches available, it is important that the method/s of handling missingness be reported. Many fields require specifics in reporting on missing data, and this is left up to the researcher to determine. However, most readers would be interested in the responses to the following questions related to the missingness:

• What is the percentage of missing values?
• How did the missingness develop? Was it respondent allocated, administration related or other?
• How the missingness was

Missing data needs to be considered throughout the course of survey-based research, from planning through to reporting. This paper has introduced multiple approaches for handling missing survey data and presented a guide for when these approaches should be used. It is essential to consider and report on missing data to accurately report the findings of a survey study.

Ardalan Mirzaei: Conceptualization, Methodology, Software, Formal analysis, Investigation, Data curation, Writing – original draft, Writing – review & editing, Visualization, Project administration. Stephen R Carter: Conceptualization, Methodology, Investigation, Data curation, Writing – original draft, Writing – review & editing, Supervision. Asad E Patanwala: Conceptualization, Methodology, Writing – original draft, Writing – review & editing. Carl R Schneider: Conceptualization, Methodology,

We would like to thank Dr Jack Collins for his initial read of the manuscript.

A. Mirzaei et al.
A. Mirzaei et al.
A. Doan et al.
A.R. Donders et al.
R.M. Durand et al.
S. Rolstad et al.
J. Dirmaier et al.
S.W. Rosenberg
S.W. Narayan et al.
P.D. Allison
D.B. Rubin
R.J. Little et al.
J.P. Vandenbroucke et al.
R.J. Little
G. Turrell
W.S. Aquilino
T. Little et al.
A. Pokropek
Y. Kim et al.
H. Schuman et al.
S. Laaksonen
Y. Dong et al.
