David Sun

DC
h-index45
5papers
395citations
Novelty22%
AI Score29

5 Papers

LGJan 24, 2025
Humanity's Last Exam

Long Phan, Alice Gatti, Ziwen Han et al. · amazon-science, apple-ml

Benchmarks are important tools for tracking the rapid advancements in large language model (LLM) capabilities. However, benchmarks are not keeping pace in difficulty: LLMs now achieve over 90\% accuracy on popular benchmarks like MMLU, limiting informed measurement of state-of-the-art LLM capabilities. In response, we introduce Humanity's Last Exam (HLE), a multi-modal benchmark at the frontier of human knowledge, designed to be the final closed-ended academic benchmark of its kind with broad subject coverage. HLE consists of 2,500 questions across dozens of subjects, including mathematics, humanities, and the natural sciences. HLE is developed globally by subject-matter experts and consists of multiple-choice and short-answer questions suitable for automated grading. Each question has a known solution that is unambiguous and easily verifiable, but cannot be quickly answered via internet retrieval. State-of-the-art LLMs demonstrate low accuracy and calibration on HLE, highlighting a significant gap between current LLM capabilities and the expert human frontier on closed-ended academic questions. To inform research and policymaking upon a clear understanding of model capabilities, we publicly release HLE at https://lastexam.ai.

DCMay 2, 2019Code
Real Differences between OT and CRDT in Building Co-Editing Systems and Real World Applications

David Sun, Chengzheng Sun, Agustina Ng et al.

OT (Operational Transformation) was invented for supporting real-time co-editors in the late 1980s and has evolved to become a core technique used in today's working co-editors and adopted in major industrial products. CRDT (Commutative Replicated Data Type) for co-editors was first proposed around 2006, under the name of WOOT (WithOut Operational Transformation). Follow-up CRDT variations are commonly labeled as "post-OT" techniques and have made broad claims of superiority over OT solutions, in terms of correctness, time and space complexity, simplicity, etc. Over one decade later, however, OT remains the choice for building the vast majority of co-editors, whereas CRDT is rarely found in working co-editors. Why? To seek truth from facts, we set out to conduct a comprehensive and critical review of representative OT and CRDT solutions and working co-editors based on them. From this work, we have made important discoveries about OT and CRDT, and revealed facts and evidences that refute CRDT claims over OT on all accounts. We present our discoveries in three related and complementary articles. In prior two articles, we have revealed the similarities of OT and CRDT in following the same general transformation approach in co-editors, and their real differences in correctness and complexity. In this article, we examine the role of building working co-editors in shaping OT and CRDT research and solutions, and consequential differences in the choice between OT and CRDT in real world co-editors and industry products. In particular, we review the evolution of co-editors from research vehicles to real world applications, and discuss representative OT-based co-editors and alternative approaches in industry products and open source projects. Moreover, we evaluate CRDT-based co-editors in relation to published CRDT solutions, and clarify some myths surrounding "peer-to-peer" co-editing.

DCMay 2, 2019
Real Differences between OT and CRDT in Correctness and Complexity for Consistency Maintenance in Co-Editors

David Sun, Chengzheng Sun, Agustina Ng et al.

OT (Operational Transformation) was invented for supporting real-time co-editors in the late 1980s and has evolved to become core techniques widely used in today's working co-editors and adopted in industrial products. CRDT (Commutative Replicated Data Type) for co-editors was first proposed around 2006, under the name of WOOT (WithOut Operational Transformation). Follow-up CRDT variations are commonly labeled as "post-OT" techniques capable of making concurrent operations natively commutative in co-editors. On top of that, CRDT solutions have made broad claims of superiority over OT solutions, and often portrayed OT as an incorrect and inefficient technique. Over one decade later, however, CRDT is rarely found in working co-editors; OT remains the choice for building the vast majority of today's co-editors. Contradictions between the reality and CRDT's purported advantages have been the source of much confusion and debate among co-editing researcher sand developers. To seek truth from facts, we set out to conduct a comprehensive and critical review on representative OT and CRDT solutions and co-editors based on them. From this work, we have made important discoveries about OT and CRDT, and revealed facts and evidences that refute CRDT claims over OT on all accounts. These discoveries help explain the underlying reasons for the choice between OT and CRDT in the real world. We report these results in a series of three articles. In the second article of this series, we reveal the differences between OT and CRDT in their basic approaches to realizing the same general transformation and how such differences had resulted in different challenges and consequential correctness and complexity issues. Moreover, we reveal hidden complexity and algorithmic flaws with representative CRDT solutions, and discuss common myths and facts related to correctness and complexity of OT and CRDT.

DCMay 2, 2019
Real Differences between OT and CRDT under a General Transformation Framework for Consistency Maintenance in Co-Editors

Chengzheng Sun, David Sun, Agustina et al.

OT (Operational Transformation) was invented for supporting real-time co-editors in the late 1980s and has evolved to become a core technique used in today's working co-editors and adopted in major industrial products. CRDT (Commutative Replicated Data Type) for co-editors was first proposed around 2006, under the name of WOOT (WithOut Operational Transformation). Follow-up CRDT variations are commonly labeled as "post-OT" techniques capable of making concurrent operations natively commutative in co-editors. On top of that, CRDT solutions have made broad claims of superiority over OT solutions, and routinely portrayed OT as an incorrect, complex and inefficient technique. Over one decade later, however, OT remains the choice for building the vast majority of co-editors, whereas CRDT is rarely found in working co-editors. Contradictions between the reality and CRDT's purported advantages have been the source of much confusion and debate in co-editing communities. Have the vast majority of co-editors been unfortunate in choosing the faulty and inferior OT, or those CRDT claims are false? What are the real differences between OT and CRDT for co-editors? What are the key factors and underlying reasons behind the choices between OT and CRDT in the real world? To seek truth from facts, we set out to conduct a comprehensive and critical review on representative OT and CRDT solutions and working co-editors based on them. From this work, we have made important discoveries about OT and CRDT, and revealed facts and evidences that refute CRDT claims over OT on all accounts. We report our discoveries in a series of three articles and the current article is the first one in this series. We hope the discoveries from this work help clear up common misconceptions and confusions surrounding OT and CRDT, and accelerate progress in co-editing technology for real world applications.

DCOct 4, 2018
Real Differences between OT and CRDT for Co-Editors

Chengzheng Sun, David Sun, Agustina et al.

OT (Operational Transformation) was invented for supporting real-time co-editors in the late 1980s and has evolved to become a core technique used in today's working co-editors and adopted in major industrial products. CRDT (Commutative Replicated Data Type) for co-editors was first proposed around 2006, under the name of WOOT (WithOut Operational Transformation). Follow-up CRDT variations are commonly labeled as "post-OT" techniques capable of making concurrent operations natively commutative, and have made broad claims of superiority over OT solutions, in terms of correctness, time and space complexity, simplicity, etc. Over one decade later, however, CRDT solutions are rarely found in working co-editors, while OT solutions remain the choice for building the vast majority of co-editors. Contradictions between this reality and CRDT's purported advantages have been the source of much debate and confusion in co-editing research and developer communities. What is CRDT really to co-editing? What are the real differences between OT and CRDT for co-editors? What are the key factors that may have affected the adoption of and choice between OT and CRDT for co-editors in the real world? In this paper, we report our discoveries, in relation to these questions and beyond, from a comprehensive review and comparison study on representative OT and CRDT solutions and working co-editors based on them. Moreover, this work reveals facts and presents evidences that refute CRDT claimed advantages over OT. We hope the results reported in this paper will help clear up common myths, misconceptions, and confusions surrounding alternative co-editing techniques, and accelerate progress in co-editing technology for real world applications.