Network survey and indirect inference with privacy avoidance

Affiliation: 1. National University of Defense Technology; 2. State Key Laboratory on Blind Signal Processing

CLC number: C811

Fund Project: National Natural Science Foundation of China (General Program, Key Program, Major Program); National Science Fund for Distinguished Young Scholars

    Abstract:

    In management decision-making, self-reported data on the true state of the people being managed are often of low quality, because respondents regard the questions as private or sensitive. The resulting samples are heavily biased, which makes it difficult to learn the real situation of the target population. To address this problem while meeting the data privacy requirements of the digital economy, this paper develops a data collection method based on indirect reports over social networks. Building on network sampling and statistical inference theory, it designs an ego-centric sampling method (ECM) that estimates population quantities from the indirectly reported sample data. The method is simple to deploy: survey subjects can be randomly sampled or fully enumerated in a census. In addition to each respondent's self-reported data, it collects the respondent's reports about their close social contacts, which sidesteps the problem of respondents withholding or misreporting their own data for reasons of sensitivity. The proposed estimator yields high-precision population estimates from the reported sample data and allows self-reported and alter-reported data to be cross-validated. The method is validated on the online social network of a hard-to-reach population with as many as 556,627 active users. Sampling experiments show that ECM's estimation error for both the network-wide average number of friends and the population characteristics is below 3%. The paper further conducts an empirical study: using self-report and alter-report questionnaires, it surveys the employees of an enterprise on general and privacy-sensitive questions and obtains population-level estimates through the indirect estimation method, demonstrating the practicality and effectiveness of ECM.
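The abstract describes the survey design but does not state the ECM estimator itself, so the following Python sketch is only an illustration of the general idea, not the paper's method. It assumes a toy undirected network, a binary sensitive trait, egos sampled uniformly at random, and egos who can report each close contact's trait and number of contacts. Because a person with many contacts is reported on more often, each alter report is weighted by the inverse of the alter's degree. All names (`build_toy_network`, `estimate_from_alter_reports`) and all data are hypothetical.

```python
import random


def build_toy_network(n, n_edges, seed=0):
    """Random undirected graph as adjacency sets (toy data, not from the paper)."""
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    edges = 0
    while edges < n_edges:
        a, b = rng.sample(range(n), 2)  # two distinct nodes
        if b not in adj[a]:
            adj[a].add(b)
            adj[b].add(a)
            edges += 1
    return adj


def estimate_from_alter_reports(adj, trait, egos):
    """Estimate mean degree and trait prevalence from ego reports.

    Assumption: each sampled ego reports its own number of contacts and,
    for every contact (alter), the alter's trait and number of contacts.
    Alter reports are weighted by 1/degree to offset the fact that
    well-connected people are reported on more often.
    """
    num = den = 0.0
    for ego in egos:
        for alter in adj[ego]:
            w = 1.0 / len(adj[alter])  # alter degree >= 1: it is tied to ego
            num += w * trait[alter]
            den += w
    mean_degree = sum(len(adj[e]) for e in egos) / len(egos)
    prevalence = num / den if den else float("nan")
    return mean_degree, prevalence


if __name__ == "__main__":
    adj = build_toy_network(n=2000, n_edges=8000, seed=42)
    rng = random.Random(7)
    # Hypothetical sensitive trait held by roughly 20% of nodes.
    trait = {v: int(rng.random() < 0.2) for v in adj}
    egos = rng.sample(sorted(adj), 300)  # uniform random sample of respondents
    est_degree, est_prev = estimate_from_alter_reports(adj, trait, egos)
    print("estimated mean degree:", round(est_degree, 2))
    print("estimated prevalence:", round(est_prev, 3))
    print("true prevalence:", round(sum(trait.values()) / len(trait), 3))
```

In this sketch the respondents never disclose their own trait. Under a full census of egos, the weighted ratio reduces to the exact prevalence among people who have at least one contact, which gives a simple sanity check on the weighting.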

History
  • Received: 2022-12-26
  • Last revised: 2023-01-02
  • Accepted: 2023-01-05
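Journal of Management Sciences in China (管理科学学报) ® 2025. All rights reserved.
Address: Room 908, Block A, Teaching Building 25, Tianjin University, 92 Weijin Road, Nankai District, Tianjin; Postal code: 300072
Tel/Fax: 022-27403197 Email: jmsc@tju.edu.cn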