Construction of the cancer patients' database based on the US National Health and Nutrition Examination Survey (NHANES) datasets for cancer epidemiology research.

Bibliographic Details
Title: Construction of the cancer patients' database based on the US National Health and Nutrition Examination Survey (NHANES) datasets for cancer epidemiology research.
Authors: Moon, Jinyoung (AUTHOR) pollux@snu.ac.kr; Mun, Yongseok (AUTHOR) skewery@gmail.com
Source: BMC Medical Research Methodology. 1/24/2025, Vol. 25 Issue 1, p1-7. 7p.
Subject Terms: *NATIONAL Health & Nutrition Examination Survey, *EPIDEMIOLOGY of cancer, *PERFLUOROOCTANOIC acid, *ENVIRONMENTAL exposure, *SULFONIC acids
Abstract: Background: The US National Health and Nutrition Examination Survey (NHANES) dataset does not include a specific question or laboratory test to confirm a history of cancer diagnosis. However, if straightforward variables for cancer history are introduced, US NHANES could be effectively utilized in future cancer epidemiology studies. To address this gap, the authors developed a cancer patient database from the US NHANES datasets by employing multiple R programming codes. Methods: To illustrate the practical application of this methodology to a real-world problem, the authors extracted the R codes applied in an academic paper published in another journal on January 30th, 2024 (https://doi.org/10.1016/j.heliyon.2024.e24337). This paper will focus on the construction of the database and analysis using R codes. Entire. Results: In the first example, the urine concentrations of monocarboxynonyl phthalate, monocarboxyoctyl phthalate, mono-2-ethyl-5-carboxypentyl phthalate, and mono-2-hydroxy-iso-butyl phthalate (all ng/mL) were used as the independent variables, instead of the serum concentrations of perfluorooctanoic acid (PFOA), perfluorooctane sulfonic acid (PFOS), perfluorohexane sulfonic acid (PFHxS), and perfluorononanoic acid (PFNA), respectively. In the second example, the serum concentrations of 2,3,3',4,4'-Pentachlorobiphenyl (PCB105), 2,3,4,4',5-Pentachlorobiphenyl (PCB114), 2,3',4,4',5-Pentachlorobiphenyl (PCB118), and 2,2',3,4,4',5'- and 2,3,3',4,4',6-Hexachlorobiphenyl (PCB138) were used as the independent variables, instead of the serum concentrations of PFOA, PFOS, PFHxS, and PFNA, respectively. Discussion: This research offers a comprehensive set of R codes aimed at creating a single, user-friendly variable that encapsulates the history of each type of cancer while also considering the age at which the diagnosis was made. The US NHANES provides a wealth of critical data on environmental toxicant exposures. By employing these R codes, researchers can potentially discover numerous new associations between environmental toxicant exposures and cancer diagnoses. Ultimately, these codes could significantly advance the field of cancer epidemiology in relation to environmental toxicant exposure. [ABSTRACT FROM AUTHOR]
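Illustrative sketch (not the authors' code): the abstract describes a single variable per cancer type that also takes the age at diagnosis into account. The idea can be illustrated roughly as follows. The sketch is written in Python rather than R, and the record layout, field names (`ever_cancer`, `diagnoses`), and helper functions are hypothetical. They are not actual NHANES variable names and do not come from the published R codes.

```python
from typing import Optional

# Hypothetical, simplified participant records. The field names are
# illustrative only and are NOT actual NHANES variable names.
participants = [
    {"id": 1, "ever_cancer": True, "diagnoses": {"breast": 52}},
    {"id": 2, "ever_cancer": False, "diagnoses": {}},
    {"id": 3, "ever_cancer": True, "diagnoses": {"prostate": 67, "skin": 70}},
    {"id": 4, "ever_cancer": None, "diagnoses": {}},  # question not answered
]


def cancer_history(record: dict, cancer_type: str) -> Optional[int]:
    """Return 1 if this cancer type was reported, 0 if not, None if unknown."""
    if record["ever_cancer"] is None:
        return None  # keep missing answers distinct from "no cancer"
    return int(record["ever_cancer"] and cancer_type in record["diagnoses"])


def age_at_diagnosis(record: dict, cancer_type: str) -> Optional[int]:
    """Return the reported age at diagnosis for this cancer type, if any."""
    return record["diagnoses"].get(cancer_type)


def build_rows(records: list, cancer_types: list) -> list:
    """Build one flat row per participant with a history flag and an
    age-at-diagnosis column for each requested cancer type."""
    rows = []
    for record in records:
        row = {"id": record["id"]}
        for cancer_type in cancer_types:
            row[f"{cancer_type}_history"] = cancer_history(record, cancer_type)
            row[f"{cancer_type}_age_dx"] = age_at_diagnosis(record, cancer_type)
        rows.append(row)
    return rows


rows = build_rows(participants, ["breast", "prostate"])
for row in rows:
    print(row)
```

Each resulting row could then be joined, by participant identifier, to an exposure measurement such as a serum or urine concentration. How the authors actually perform that step is not stated in this record.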
Database: Academic Search Complete
ISSN: 1471-2288
DOI: 10.1186/s12874-025-02478-5
Published in: BMC Medical Research Methodology
Language: English