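Computational social science is undergoing a profound paradigm shift from being “big data-driven” to being “large model-driven.” This transformation, centered on large language models, is not merely an iteration of tools but a structural change that reshapes the research methods, epistemology, and power structures of the social sciences. At the methodological level, the entire research workflow is being fundamentally rewritten: powerful “vectorized representations” replace transparent operationalized indicators, improving measurement validity while introducing a “validity black box”; the ability to map cultural structures in fine detail comes with the risk of “algorithmic hallucination”; and the new mode of “human-machine co-creation” in theory building faces a crisis of “theoretical degradation,” in which profound insight is displaced by polished mediocrity. Together, these changes point to two core dilemmas. Epistemologically, “prediction” overwhelmingly dominates “explanation,” and the social sciences risk losing their mission of causal explanation. In terms of power structure, the tools and capabilities of knowledge production are becoming highly concentrated in a few technology platforms, eroding the autonomy and critical capacity of the academic community. At the same time, the ethical focus of research has shifted from “data privacy” to the “algorithmic harm” caused directly by models and to disputes over “value alignment.” In the face of these challenges, the social sciences should respond by cultivating a “new sociological imagination” capable of connecting the technical outputs of models with the social structures behind them.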
Abstract: Computational Social Science (CSS) is undergoing a fundamental paradigm shift from a “big data-driven” to a “large model-driven” approach. This article argues that the revolution spearheaded by Large Language Models (LLMs) is a structural transformation rather than a mere iterative tool upgrade, fundamentally reshaping the methodology, epistemology, and power dynamics of the social sciences. Methodologically, the transition toward “vectorized representation” over transparent operational indicators has introduced a “validity black box,” where enhanced measurement masks a loss of clarity. Furthermore, the shift toward human-AI theoretical co-creation faces a “theoretical degradation,” where profound insights risk being supplanted by polished mediocrity. These transformations point to two core crises: an epistemological imbalance where prediction eclipses causal explanation, and a concentration of knowledge-production power within tech platforms that erodes academic autonomy. Ethically, concerns have pivoted from data privacy to direct algorithmic harm and the complexities of value alignment. Ultimately, this paper advocates for a “new sociological imagination” capable of bridging model outputs with underlying social structures.
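Basic information:
DOI: 10.19484/j.cnki.1000-8934.2026.05.005
Chinese Library Classification (CLC) number: TP18
Citation:
[1] Du Juan, Li Xiaoyi. From Big Data to Large Models: How Is Artificial Intelligence Reshaping the Research Paradigm of Computational Social Science? [J]. Studies in Dialectics of Nature, 2026, 42(05): 42-52. DOI: 10.19484/j.cnki.1000-8934.2026.05.005.
Funding:
National Social Science Fund of China, General Project “‘Crowding Out’ or ‘Complementarity’? An Experimental Study of the Interaction between Material Incentives and Social Preferences” (23BJL124)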
2026-05-18