
<?xml version="1.0" encoding="UTF-8"?><xml><records><record><source-app name="Biblio" version="7.x">Drupal-Biblio</source-app><ref-type>17</ref-type><contributors><authors><author><style face="normal" font="default" size="100%">Xavier Javines Bilon</style></author><author><style face="normal" font="default" size="100%">Jose Antonio R. Clemente</style></author></authors></contributors><titles><title><style face="normal" font="default" size="100%">Evaluation of sampling methods for content analysis of Facebook data</style></title><secondary-title><style face="normal" font="default" size="100%">The Philippine Statistician</style></secondary-title></titles><dates><year><style  face="normal" font="default" size="100%">2020</style></year></dates><urls><web-urls><url><style face="normal" font="default" size="100%">https://www.psai.ph/tps_details.php?id=125</style></url></web-urls></urls><language><style face="normal" font="default" size="100%">eng</style></language><abstract><style face="normal" font="default" size="100%">A methodological challenge for researchers performing content analysis on social media data involves deciding on a sampling procedure for obtaining content to be analyzed with the least sampling error. The study used and recommended two different kinds of elementary unit (post and day) that allow probability sampling of Facebook data, regardless of whether the sampling frame of all posts within the time period of interest is obtainable. Four sampling designs for post as elementary unit and five for day as elementary unit, including three sampling options commonly used for content analysis, namely simple random sampling without replacement (SRSWOR), constructed week sampling, and consecutive day sampling, were employed on Facebook data mined from Mocha Uson Blog from 2010 to 2018. Estimates for parameters, such as measures of user engagement and proportions of topic-related posts, were obtained at increasing sample sizes.
Sampling designs for each elementary unit were evaluated by comparing the normalized area under the coefficient of variation curve (NAUCV) over the different sample sizes. For post as elementary unit, with content type as the stratification variable, stratified random sampling (StRS) using Neyman allocation based on total user engagement is recommended (average NAUCV = 31.28%). For day as elementary unit, SRSWOR is recommended (average NAUCV = 42.31%).</style></abstract></record></records></xml>