Studies in Dialectics of Nature (自然辩证法研究)

2026, No. 05, Vol. 42, pp. 42-52

From Big Data to Large Models: How Is Artificial Intelligence Reshaping the Research Paradigm of Computational Social Science?

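Du Juan (杜娟)¹, Li Xiaoyi (李晓义)²

1. School of Science, Tianjin Chengjian University; 2. College of Tourism and Service Management, Nankai University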

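Foundation: National Social Science Fund of China, General Project "'Crowding Out' or 'Complementarity'? An Experimental Study of the Interaction between Material Incentives and Social Preferences" (23BJL124)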

DOI: 10.19484/j.cnki.1000-8934.2026.05.005

Release date: 2026-05-18

Publication date: 2026-05-18


Abstract:

Computational social science (CSS) is undergoing a profound paradigm shift from a “big data-driven” to a “large model-driven” approach. This article argues that the transformation centered on large language models (LLMs) is not merely an iteration of tools but a structural change that reshapes the methods, epistemology, and power structure of social science research. Methodologically, the entire research workflow is being fundamentally rewritten: powerful “vectorized representations” replace transparent operationalized indicators, improving measurement validity while introducing a “validity black box”; the capacity to map cultural structures in fine detail comes with the risk of “algorithmic hallucination”; and the new mode of human-AI theoretical co-creation faces a crisis of “theoretical degradation,” in which profound insights are supplanted by polished mediocrity. Together, these changes point to two core predicaments. Epistemologically, prediction gains an overwhelming advantage over explanation, and social science risks losing its mission of causal explanation. In terms of power structure, the tools and capabilities of knowledge production are becoming highly concentrated in a few technology platforms, eroding the autonomy and critical capacity of the academic community. At the same time, the ethical focus of research has shifted from data privacy to “algorithmic harm” caused directly by models and to disputes over “value alignment.” In response to these challenges, the article calls for cultivating a “new sociological imagination” capable of connecting the technical outputs of models with the social structures behind them.

Keywords: computational social science; big data; large models; paradigm

The full text is available at cnki.net.

Basic information:

Chinese Library Classification (CLC) number: TP18

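Citation:

Du Juan, Li Xiaoyi. From Big Data to Large Models: How Is Artificial Intelligence Reshaping the Research Paradigm of Computational Social Science? [J]. Studies in Dialectics of Nature, 2026, 42(05): 42-52. DOI: 10.19484/j.cnki.1000-8934.2026.05.005.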
